Abstract

The main goal of this paper is to derive sufficient conditions for the existence of an optimal 
control strategy for the long run average continuous control problem of piecewise deterministic 
Markov processes (PDMP's) taking values in a general Borel space and with compact action space 
depending on the state variable. In order to do that we apply the so-called vanishing discount 
approach (see [3], page 83) to obtain a solution to an average cost optimality inequality associated
to the long run average cost problem. Our main assumptions are written in terms of some
integro-differential inequalities related to the so-called expected growth condition, and geometric 
convergence of the post-jump location kernel associated to the PDMP. 
1 Introduction



A general family of non-diffusion stochastic models suitable for formulating optimization problems in
several areas of operations research, namely piecewise-deterministic Markov processes (PDMP's), was
introduced in [6] and [8]. These processes are determined by three local characteristics: the flow $\phi$,
the jump rate $\lambda$, and the transition measure $Q$. Starting from $x$ the motion of the process follows the
flow $\phi(x, t)$ until the first jump time $T_1$, which occurs either spontaneously in a Poisson-like fashion
with rate $\lambda$ or when the flow $\phi(x, t)$ hits the boundary of the state space. In either case the location
of the process at the jump time $T_1$ is selected by the transition measure $Q(\phi(x, T_1), \cdot)$ and the motion
restarts from this new point as before. A suitable choice of the state space and the local characteristics
$\phi$, $\lambda$, and $Q$ provides stochastic models covering a great number of problems of operations research 0].

There exist two types of control for PDMP's: continuous control and impulse control. This
terminology was introduced by M.H.A. Davis in [8, page 134], where continuous control is used to
describe situations in which the control variable acts at all times on the process through the
characteristics $(\phi, \lambda, Q)$ by influencing the deterministic motion and the probability of the jumps. On the other
hand, the terminology impulse control refers to a control that intervenes on the process by moving it
to a new point of the state space at some times specified by the controller.

The long run average continuous control problem of PDMP's taking values in a general Borel space
was studied in [4]. At each point $x$ of the state space a control variable is chosen from a compact
action set $\mathbb{U}(x)$ and is applied on the jump parameter $\lambda$ and transition measure $Q$. The goal was to
minimize the long run average cost, which is composed of a running cost and a boundary cost (which
is added each time the PDMP touches the boundary). Both costs are assumed to be positive but
not necessarily bounded. As far as the authors are aware, this was the first time that this kind of
problem was considered in the literature. Indeed, results are available for the long run average cost
problem but for impulse control see Costa [3], Gatarek [HI and the book by M.H.A. Davis [s| (see 
the references therein). On the other hand, the continuous control problem has been studied only 
for discounted costs by A. Almudevar [l|, M.H.A. Davis 0,0], M.A.H. Dempster and J.J. Ye 
Forwick, Schal, and Schmitz 0], M. Schal 0, A. A. Yushkevich @,El|]. 



This paper deals with the vanishing discount approach for the long run average continuous control problem
of a PDMP and can be seen as a continuation of the results derived in [4]. By exploiting the special
features of the PDMP's we trace a parallel with the general theory for discrete-time Markov Decision 
Processes (see, for instance, 15, 03]) rather than the continuous-time case (see, for instance [iil]). 



The two main reasons for doing so are to use the powerful tools developed in the discrete-time
framework (see for example the references (3. [ill. 16, 17]) and to avoid working with the infinitesimal
generator associated to a PDMP, whose domain of definition is in most cases difficult to
characterize. We develop further the approach presented by the authors in [4], which consists
of using a connection between the continuous-time control problem of a PDMP and a discrete-time
optimality equation (see the introduction of section 0] for a detailed explanation of this method). In
particular, we derive sufficient conditions under which a boundedness condition (with the lower bound
being a function rather than a constant as supposed in [4]) on the value functions for the discounted
problems is satisfied. The main assumptions for this are based on some integro-differential inequalities
related to the so-called expected growth condition (see Assumption 3.1), and geometric convergence
of the post-jump location kernel associated to the PDMP (see Assumption 3.6). As a consequence, we
obtain a result of existence of an optimal ordinary control strategy for the long run average control
problem of a PDMP having the important property of being in a feedback form.
The paper is organized in the following way. In section 2 we introduce some notation, basic
assumptions, and the problem formulation. In section 3 we introduce several assumptions related
to the continuity of the parameters, the expected growth condition and geometric convergence of
the post-jump location of the PDMP. We then provide several key auxiliary results for
obtaining a bound for the discounted problems, and some extensions of the results presented in [4]
to the case in which the functions under consideration are not necessarily positive but just bounded
by a test function $g$. The main results are presented in section 4, which provides sufficient conditions
for the existence of an optimal control strategy for the long run average continuous control problem
of a PDMP and yields a solution to an average cost optimality inequality associated to the long run
average cost problem.



2 Notation, basic assumptions, and problem formulation 

2.1 Presentation of the control problem 

In this section we present some standard notation and some basic definitions related to the motion
of a PDMP $\{X(t)\}$, and the control problems we will consider throughout the paper. For further
details and properties the reader is referred to 0]. The following notation will be used in this paper:
$\mathbb{N}$ denotes the set of natural numbers, $\mathbb{R}$ the set of real numbers, $\mathbb{R}_+$ the set of positive real numbers
and $\mathbb{R}^d$ the $d$-dimensional Euclidean space. We write $\eta$ for the Lebesgue measure on $\mathbb{R}$. For $X$ a metric
space, $\mathcal{B}(X)$ represents the $\sigma$-algebra generated by the open sets of $X$. $\mathcal{M}(X)$ (respectively, $\mathcal{P}(X)$)
denotes the set of all finite (respectively probability) measures on $(X, \mathcal{B}(X))$. Let $X$ and $Y$ be metric
spaces. The set of all Borel measurable (respectively bounded) functions from $X$ into $Y$ is denoted
by $\mathbb{M}(X; Y)$ (respectively $\mathbb{B}(X; Y)$). Moreover, for notational simplicity $\mathbb{M}(X)$ (respectively $\mathbb{B}(X)$,
$\mathbb{M}(X)^+$, $\mathbb{B}(X)^+$) denotes $\mathbb{M}(X; \mathbb{R})$ (respectively $\mathbb{B}(X; \mathbb{R})$, $\mathbb{M}(X; \mathbb{R}_+)$, $\mathbb{B}(X; \mathbb{R}_+)$). For $g \in \mathbb{M}(X)$ with
$g(x) > 0$ for all $x \in X$, $\mathbb{B}_g(X)$ is the set of functions $v \in \mathbb{M}(X)$ such that
\[
\|v\|_g = \sup_{x \in X} \frac{|v(x)|}{g(x)} < +\infty.
\]
$\mathbb{C}(X)$ denotes the set of continuous functions from $X$ to $\mathbb{R}$. For $h \in \mathbb{M}(X)$, $h^+$ (respectively $h^-$)
denotes the positive (respectively, negative) part of $h$.

Let $E$ be an open subset of $\mathbb{R}^n$, $\partial E$ its boundary, and $\bar{E}$ its closure. A controlled PDMP is determined
by its local characteristics $(\phi, \lambda, Q)$, as presented in the sequel. The flow $\phi(x,t)$ is a function
$\phi : \mathbb{R}^n \times \mathbb{R}_+ \to \mathbb{R}^n$ continuous in $(x, t)$ and such that $\phi(x, t + s) = \phi(\phi(x, t), s)$. For each $x \in E$ the time
the flow takes to reach the boundary starting from $x$ is defined as $t_*(x) = \inf\{t > 0 : \phi(x,t) \in \partial E\}$.
For $x \in E$ such that $t_*(x) = \infty$ (that is, the flow starting from $x$ never touches the boundary), we set
$\phi(x, t_*(x)) = \Delta$, where $\Delta$ is a fixed point in $\partial E$. We define the following space of functions absolutely
continuous along the flow with limit towards the boundary:
\[
\mathbb{M}^{ac}(E) = \Big\{ g \in \mathbb{M}(E) : g(\phi(x,t)) : [0, t_*(x)) \to \mathbb{R} \text{ is absolutely continuous for each } x \in E
\text{ and whenever } t_*(x) < \infty \text{ the limit } \lim_{t \to t_*(x)} g(\phi(x,t)) \text{ exists} \Big\}.
\]
For $g \in \mathbb{M}^{ac}(E)$ and $z \in \partial E$ for which there exists $x \in E$ such that $z = \phi(x, t_*(x))$ where $t_*(x) < \infty$,
we define $g(z) = \lim_{t \to t_*(x)} g(\phi(x,t))$ (note that the limit exists by assumption). As shown in Lemma 2
in [1], for $g \in \mathbb{M}^{ac}(E)$ there exists a function $\mathcal{X}g \in \mathbb{M}(E)$ such that for all $x \in E$ and $t \in [0, t_*(x))$
\[
g(\phi(x,t)) - g(x) = \int_0^t \mathcal{X}g(\phi(x,s))\,ds.
\]

The local characteristics $\lambda$ and $Q$ depend on a control action $u \in \mathbb{U}$, where $\mathbb{U}$ is a compact metric space
(there is no loss of generality in assuming this property for $\mathbb{U}$, see Remark 2.8 in 0]), in the following
way: $\lambda \in \mathbb{M}(\bar{E} \times \mathbb{U})^+$ and $Q$ is a stochastic kernel on $E$ given $\bar{E} \times \mathbb{U}$. For each $x \in \bar{E}$ we define the
subsets $\mathbb{U}(x)$ of $\mathbb{U}$ as the set of feasible control actions that can be taken when the state process is in
$x \in \bar{E}$, that is, the control action that will be applied to $\lambda$ and $Q$ must belong to $\mathbb{U}(x)$. The following
assumptions, based on the standard theory of Markov decision processes (see for example [3]), will
be made throughout the paper:

Assumption 2.1 For all $x \in \bar{E}$, $\mathbb{U}(x)$ is a compact subspace of $\mathbb{U}$.

Assumption 2.2 The set $K = \{(x,a) : x \in \bar{E}, a \in \mathbb{U}(x)\}$ is a Borel subset of $\bar{E} \times \mathbb{U}$.

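We present next the definition of an admissible control strategy and the associated motion of the
controlled process. A control policy $U$ is a pair of functions $(u, u_\partial) \in \mathbb{M}(\mathbb{N} \times E \times \mathbb{R}_+; \mathbb{U}) \times \mathbb{M}(\mathbb{N} \times E; \mathbb{U})$
satisfying $u(n, x, t) \in \mathbb{U}(\phi(x, t))$ and $u_\partial(n, x) \in \mathbb{U}(\phi(x, t_*(x)))$ for all $(n, x, t) \in \mathbb{N} \times E \times \mathbb{R}_+$. The class
of admissible control strategies will be denoted by $\mathcal{U}$. Consider the state space $\widehat{E} = E \times E \times \mathbb{R}_+ \times \mathbb{N}$.
For a control policy $U = (u, u_\partial)$ let us introduce the following parameters for $\hat{x} = (x, z, s, n) \in \widehat{E}$: the
flow $\hat{\phi}(\hat{x}, t) = (\phi(x, t), z, s + t, n)$, the jump rate $\hat{\lambda}^U(\hat{x}) = \lambda(x, u(n, z, s))$, and the transition measure
\[
\widehat{Q}^U(\hat{x}, A \times B \times \{0\} \times \{n+1\}) =
\begin{cases}
Q(x, u(n, z, s); A \cap B) & \text{if } x \in E, \\
Q(x, u_\partial(n, z); A \cap B) & \text{if } x \in \partial E,
\end{cases}
\]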

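for $A$ and $B$ in $\mathcal{B}(E)$. From [8, section 25], it can be shown that for any control strategy $U = (u, u_\partial) \in \mathcal{U}$
there exists a filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}, \{P^U_{\hat{x}}\}_{\hat{x} \in \widehat{E}})$ such that the piecewise deterministic
Markov process $\{\widehat{X}^U(t)\}$ with local characteristics $(\hat{\phi}, \hat{\lambda}^U, \widehat{Q}^U)$ may be constructed as follows. For
notational simplicity the probability $P^U_{\hat{x}_0}$ will be denoted by $P^U_{(x,k)}$ for $\hat{x}_0 = (x, x, 0, k) \in \widehat{E}$. Take a
random variable $T_1$ such that
\[
P^U_{(x,k)}(T_1 > t) =
\begin{cases}
e^{-\Lambda^U(x,k,t)} & \text{for } t < t_*(x), \\
0 & \text{for } t \geq t_*(x),
\end{cases}
\]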



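where for $x \in E$ and $t \in [0, t_*(x)[$, $\Lambda^U(x, k, t) = \int_0^t \lambda(\phi(x, s), u(k, x, s))\,ds$. If $T_1$ is equal to infinity, then
for $t \in \mathbb{R}_+$, $\widehat{X}^U(t) = (\phi(x,t), x, t, k)$. Otherwise select independently an $\widehat{E}$-valued random variable
(labelled $\widehat{X}^U_1$) having distribution
\[
P^U_{(x,k)}\big(\widehat{X}^U_1 \in A \times B \times \{0\} \times \{k+1\} \,\big|\, \sigma\{T_1\}\big) =
\begin{cases}
Q(\phi(x, T_1), u(k, x, T_1); A \cap B) & \text{if } \phi(x, T_1) \in E, \\
Q(\phi(x, T_1), u_\partial(k, x); A \cap B) & \text{if } \phi(x, T_1) \in \partial E.
\end{cases}
\]
The trajectory of $\{\widehat{X}^U(t)\}$ starting from $(x, x, 0, k)$, for $t \leq T_1$, is given by
\[
\widehat{X}^U(t) =
\begin{cases}
(\phi(x, t), x, t, k) & \text{for } t < T_1, \\
\widehat{X}^U_1 & \text{for } t = T_1.
\end{cases}
\]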

Starting from $\widehat{X}^U(T_1) = \widehat{X}^U_1$, we now select the next inter-jump time $T_2 - T_1$ and post-jump location
$\widehat{X}^U(T_2) = \widehat{X}^U_2$ in a similar way. Let us define the components of the PDMP $\{\widehat{X}^U(t)\}$ by $\widehat{X}^U(t) =
(X(t), Z(t), \tau(t), N(t))$. From the previous construction, it is easy to see that $X(t)$ corresponds to the
trajectory of the system, $Z(t)$ is the value of $X(t)$ at the last jump time before $t$, $\tau(t)$ is the time elapsed
between the last jump and time $t$, and $N(t)$ is the number of jumps of the process $\{X(t)\}$ at time $t$.
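
The construction above can be mimicked numerically. The following Python sketch is only an illustration under assumptions that are not part of the paper: a one-dimensional state space $E = (0,1)$, the flow $\phi(x,t) = x + t$ (so that $t_*(x) = 1 - x$), a hypothetical bounded jump rate `rate`, a uniform post-jump distribution standing in for $Q$, and a hypothetical feedback rule `feedback` standing in for the control. Spontaneous jump times are drawn by thinning a Poisson process of rate `rate_bound`, which is a valid sampling scheme provided that `rate_bound` dominates the jump rate.

```python
import random


def simulate_pdmp(x0, horizon, control, jump_rate, rate_bound, rng):
    """Simulate one path of a toy controlled PDMP on E = (0, 1).

    Flow: phi(x, t) = x + t, so the boundary {1} is hit at t_*(x) = 1 - x.
    Jumps: spontaneous with rate jump_rate(x, u), or forced at the boundary.
    Post-jump location: uniform on [0, 0.5], a stand-in for the kernel Q.
    Returns a list of (jump_time, post_jump_state, hit_boundary).
    """
    t, x = 0.0, x0
    jumps = []
    while t < horizon:
        time_to_boundary = 1.0 - x
        s = 0.0  # time elapsed since the last jump
        hit_boundary = False
        while True:
            # Thinning: propose candidate jump times at the dominating rate.
            s += rng.expovariate(rate_bound)
            if s >= time_to_boundary:
                s, hit_boundary = time_to_boundary, True
                break
            y = x + s  # state reached along the flow
            if rng.random() * rate_bound <= jump_rate(y, control(y)):
                break  # candidate accepted: spontaneous jump
        t += s
        if t >= horizon:
            break
        x = rng.uniform(0.0, 0.5)
        jumps.append((t, x, hit_boundary))
    return jumps


def feedback(x):
    # Hypothetical feedback rule: stronger action close to the boundary.
    return 2.0 if x > 0.7 else 0.5


def rate(x, u):
    # Hypothetical controlled jump rate lambda(x, u), bounded by 3 on (0, 1).
    return u * (0.5 + x)


path = simulate_pdmp(0.2, 50.0, feedback, rate, 3.0, random.Random(0))
boundary_hits = sum(1 for _, _, hit in path if hit)
print(len(path), "jumps,", boundary_hits, "at the boundary")
```

In this toy model every inter-jump time is at most 1, so the jump times cannot accumulate on a bounded interval; this is the kind of behaviour that the next assumption requires in general.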
As in Davis [8], we consider the following assumption to avoid any accumulation point of the jump
times:

Assumption 2.3 For any $x \in E$, $U = (u, u_\partial) \in \mathcal{U}$, and $t \geq 0$, we have
\[
E^U_{(x,0)}\Big[ \sum_{i=1}^{\infty} I_{\{T_i \leq t\}} \Big] < \infty.
\]

The costs of our control problem will contain two terms, a running cost $f$ and a boundary cost $r$,
satisfying the following properties:

Assumption 2.4 $f \in \mathbb{M}(E \times \mathbb{U})^+$, and $r \in \mathbb{M}(\partial E \times \mathbb{U})^+$.

Define for $\alpha \geq 0$, $t \in \mathbb{R}_+$, and $U \in \mathcal{U}$,
\[
J^\alpha(U,t) = \int_0^t e^{-\alpha s} f\big(X(s), u(N(s), Z(s), \tau(s))\big)\,ds + \int_0^t e^{-\alpha s} r\big(X(s-), u_\partial(N(s-), Z(s-))\big)\,dp^*(s),
\]
where $p^*(t) = \sum_{i=1}^{\infty} I_{\{T_i \leq t\}} I_{\{X(T_i-) \in \partial E\}}$ counts the number of times the process hits the boundary up
to time $t$ and, for notational simplicity, set $J(U, t) = J^0(U,t)$. The long-run average cost we want to
minimize over $\mathcal{U}$ is given by $\mathcal{A}(U, x) = \limsup_{t \to +\infty} \frac{1}{t} E^U_{(x,0)}[J(U, t)]$ and we set $\mathcal{J}_{\mathcal{A}}(x) = \inf_{U \in \mathcal{U}} \mathcal{A}(U,x)$.
For the $\alpha$-discounted case, with $\alpha > 0$, the cost we want to minimize is given by $\mathcal{D}^\alpha(U,x) =
E^U_{(x,0)}[J^\alpha(U,\infty)]$ and we set $\mathcal{J}^\alpha_{\mathcal{D}}(x) = \inf_{U \in \mathcal{U}} \mathcal{D}^\alpha(U,x)$. We need the following assumption, to avoid
infinite costs for the discounted case.

Assumption 2.5 For all $\alpha > 0$ and all $x \in E$, $\mathcal{J}^\alpha_{\mathcal{D}}(x) < \infty$.


2.2 Discrete-time relaxed and ordinary controls 

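We present in this sub-section the set of discrete-time relaxed and ordinary controls.
Consider $\mathbb{C}(\mathbb{U})$ equipped with the topology of uniform convergence and $\mathcal{M}(\mathbb{U})$ equipped with the
weak* topology $\sigma(\mathcal{M}(\mathbb{U}), \mathbb{C}(\mathbb{U}))$. For $x \in E$, define $\mathcal{P}_x(\mathbb{U})$ as the set of measures $\mu \in \mathcal{P}(\mathbb{U})$ satisfying
$\mu(\mathbb{U}(\phi(x, t_*(x)))) = 1$. $\mathcal{P}(\mathbb{U})$ and $\mathcal{P}_x(\mathbb{U})$ for $x \in E$ are subsets of $\mathcal{M}(\mathbb{U})$ and are equipped with the
relative topology.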

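Let $\mathcal{V}^r$ (respectively $\mathcal{V}^r(x)$ for $x \in E$) be the set of all $\eta$-measurable functions $\mu$ defined on $\mathbb{R}_+$ with
value in $\mathcal{P}(\mathbb{U})$ such that $\mu(t, \mathbb{U}) = 1$ $\eta$-a.e. (respectively $\mu(t, \mathbb{U}(\phi(x, t))) = 1$ $\eta$-a.e.). It can be shown
(see sub-section 3.1 in [4]) that $\mathcal{V}^r(x)$ is a compact set of the metric space $\mathcal{V}^r$: a sequence $(\mu_n)_{n \in \mathbb{N}}$ in
$\mathcal{V}^r(x)$ converges to $\mu$ if and only if for all $g \in L^1(\mathbb{R}_+; \mathbb{C}(\mathbb{U}))$
\[
\lim_{n \to \infty} \int_{\mathbb{R}_+} \int_{\mathbb{U}(\phi(x,t))} g(t,u)\,\mu_n(t,du)\,dt = \int_{\mathbb{R}_+} \int_{\mathbb{U}(\phi(x,t))} g(t,u)\,\mu(t,du)\,dt.
\]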

The sets of relaxed controls can be defined as follows: $\mathbb{V}^r(x) = \mathcal{V}^r(x) \times \mathcal{P}_x(\mathbb{U})$ for $x \in E$, and $\mathbb{V}^r =
\mathcal{V}^r \times \mathcal{P}(\mathbb{U})$. The set of ordinary controls, denoted by $\mathbb{V}$ (respectively $\mathbb{V}(x)$ for $x \in E$), is defined as above
except that it is composed of deterministic functions instead of probability measures. More specifically
we have $\mathcal{V}(x) = \{\nu \in \mathbb{M}(\mathbb{R}_+, \mathbb{U}) : (\forall t \in \mathbb{R}_+),\ \nu(t) \in \mathbb{U}(\phi(x,t))\}$, $\mathbb{V}(x) = \mathcal{V}(x) \times \mathbb{U}(\phi(x, t_*(x)))$,
$\mathbb{V} = \mathbb{M}(\mathbb{R}_+, \mathbb{U}) \times \mathbb{U}$. Consequently, the set of ordinary controls is a subset of the set of relaxed controls
$\mathbb{V}^r$ (respectively $\mathbb{V}^r(x)$ for $x \in E$) by identifying any control action $u \in \mathbb{U}$ with the Dirac measure
concentrated on $u$. Thus we can write that $\mathbb{V} \subset \mathbb{V}^r$ (respectively $\mathbb{V}(x) \subset \mathbb{V}^r(x)$ for $x \in E$) and from
now on we will consider that $\mathbb{V}$ (respectively $\mathbb{V}(x)$ for $x \in E$) is endowed with the topology
generated by $\mathbb{V}^r$. The necessity to introduce the class of relaxed controls $\mathbb{V}^r$ is justified by the fact that
in general there does not exist a topology for which $\mathbb{V}$ and $\mathbb{V}(x)$ are compact sets.
As in [la ], page 14, we need that the set of feasible state/relaxed-control pairs is a measurable subset
of $\mathcal{B}(E) \times \mathcal{B}(\mathbb{V}^r)$, that is, we need the following assumption.

Assumption 2.6 $K = \{(x, \Theta) : \Theta \in \mathbb{V}^r(x), x \in E\} \in \mathcal{B}(E) \times \mathcal{B}(\mathbb{V}^r)$.

A sufficient condition is presented in [4, Proposition 3.3] to ensure that Assumption 2.6 holds.
2.3 Discrete-time operators and measurability properties

In this sub-section we present some important operators associated to the optimality equation of the
discrete-time problem. We consider the following notation:
\[
w(x,\mu) = \int_{\mathbb{U}} w(x,u)\,\mu(du), \qquad
Qh(x,\mu) = \int_{\mathbb{U}} \int_E h(z)\,Q(x,u;dz)\,\mu(du), \qquad
\lambda Qh(x,\mu) = \int_{\mathbb{U}} \lambda(x,u) \int_E h(z)\,Q(x,u;dz)\,\mu(du),
\]
for $x \in \bar{E}$, $\mu \in \mathcal{P}(\mathbb{U})$, $h \in \mathbb{M}(E)^+$ and $w \in \mathbb{M}(\bar{E} \times \mathbb{U})^+$.

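The following operators will be associated to the optimality equations of the discrete-time problems
that will be presented in the next sections. For $\Theta = (\mu, \mu_\partial) \in \mathbb{V}^r$, $(x, A) \in E \times \mathcal{B}(E)$, $\alpha \in \mathbb{R}$, according
to Lemma 2 in 111, Appendix 5] define
\[
\Lambda^\mu(x,t) = \int_0^t \lambda(\phi(x,s), \mu(s))\,ds,
\]
\[
G_\alpha(x,\Theta;A) = \int_0^{t_*(x)} e^{-\alpha s - \Lambda^\mu(x,s)} \lambda Q I_A(\phi(x,s), \mu(s))\,ds
+ e^{-\alpha t_*(x) - \Lambda^\mu(x,t_*(x))} Q(\phi(x, t_*(x)), \mu_\partial; A). \tag{1}
\]
For $h \in \mathbb{M}(E)^+$, we define $G_\alpha h(x,\Theta) = \int_E h(y)\,G_\alpha(x,\Theta;dy)$. For $x \in E$, $\Theta = (\mu, \mu_\partial) \in \mathbb{V}^r$, $v \in
\mathbb{M}(E \times \mathbb{U})^+$, $w \in \mathbb{M}(\partial E \times \mathbb{U})^+$, $\alpha \in \mathbb{R}$, introduce
\[
L_\alpha v(x,\Theta) = \int_0^{t_*(x)} e^{-\alpha s - \Lambda^\mu(x,s)} v(\phi(x,s), \mu(s))\,ds, \tag{2}
\]
\[
H_\alpha w(x,\Theta) = e^{-\alpha t_*(x) - \Lambda^\mu(x,t_*(x))} w(\phi(x, t_*(x)), \mu_\partial). \tag{3}
\]
For $h \in \mathbb{M}(E)$ (respectively, $v \in \mathbb{M}(E \times \mathbb{U})$), $G_\alpha h(x,\Theta) = G_\alpha h^+(x,\Theta) - G_\alpha h^-(x,\Theta)$ (respectively,
$L_\alpha v(x,\Theta) = L_\alpha v^+(x,\Theta) - L_\alpha v^-(x,\Theta)$) provided the difference has a meaning. It will be useful in the
sequel to define the function $\mathcal{L}_\alpha(x,\Theta)$ as follows: $\mathcal{L}_\alpha(x,\Theta) = L_\alpha I_{E \times \mathbb{U}}(x,\Theta)$. In particular for $\alpha = 0$
we write for simplicity $G_0 = G$, $L_0 = L$, $H_0 = H$, $\mathcal{L}_0 = \mathcal{L}$. Measurability properties of the operators
$G_\alpha$, $L_\alpha$, and $H_\alpha$ are shown in [4, Proposition 3.4].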
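
A useful consequence of definitions (1) and (2) is the identity $G_\alpha(x,\Theta;E) + \alpha \mathcal{L}_\alpha(x,\Theta) = 1$ when $t_*(x) < \infty$: since $Q$ is a stochastic kernel, the integrand of $G_\alpha 1 + \alpha \mathcal{L}_\alpha$ is $(\alpha + \lambda) e^{-\alpha s - \Lambda^\mu(x,s)}$, which is the derivative of $-e^{-\alpha s - \Lambda^\mu(x,s)}$. The Python sketch below checks this numerically on a toy model; the state space $(0,1)$, the flow $\phi(x,s) = x + s$ and the jump rate $1 + (x+s)^2$ along the flow are assumptions chosen for illustration only, and the helper names are hypothetical.

```python
import math


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def check_identity(x, alpha):
    """Return G_alpha 1 + alpha * calL_alpha at x for the toy model below."""
    t_star = 1.0 - x  # flow phi(x, s) = x + s on E = (0, 1)

    def lam(s):
        # Hypothetical jump rate along the flow (control already plugged in).
        return 1.0 + (x + s) ** 2

    def big_lambda(s):
        # Closed form of the integral of lam over [0, s].
        return s + ((x + s) ** 3 - x ** 3) / 3.0

    def discount(s):
        return math.exp(-alpha * s - big_lambda(s))

    # calL_alpha = L_alpha applied to the constant function 1, as in (2).
    cal_l = simpson(discount, 0.0, t_star)
    # G_alpha 1 from (1): Q(...; E) = 1 because Q is a stochastic kernel.
    g_one = simpson(lambda s: discount(s) * lam(s), 0.0, t_star) + discount(t_star)
    return g_one + alpha * cal_l


for alpha in (0.0, 0.5, 2.0):
    print(alpha, check_identity(0.3, alpha))
```

Up to quadrature error, each printed value should be equal to one, whatever the discount factor.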

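We present now the definitions of the one-stage optimization operators.

Definition 2.7 Let $\alpha \in \mathbb{R}_+$, $\rho \in \mathbb{R}$, and $h \in \mathbb{M}(E)$. Assume that for any $x \in E$ and $\Upsilon \in \mathbb{V}(x)$,
$-\rho \mathcal{L}_\alpha(x,\Upsilon) + L_\alpha f(x,\Upsilon) + H_\alpha r(x,\Upsilon) + G_\alpha h(x,\Upsilon)$ is well defined. The (ordinary) one-stage optimization
operator is defined by
\[
\mathcal{T}_\alpha(\rho,h)(x) = \inf_{\Upsilon \in \mathbb{V}(x)} \big\{ -\rho \mathcal{L}_\alpha(x,\Upsilon) + L_\alpha f(x,\Upsilon) + H_\alpha r(x,\Upsilon) + G_\alpha h(x,\Upsilon) \big\}.
\]
Assume that for any $x \in E$ and $\Theta \in \mathbb{V}^r(x)$, $-\rho \mathcal{L}_\alpha(x,\Theta) + L_\alpha f(x,\Theta) + H_\alpha r(x,\Theta) + G_\alpha h(x,\Theta)$ is well
defined. The relaxed one-stage optimization operator is defined by
\[
\mathcal{R}_\alpha(\rho,h)(x) = \inf_{\Theta \in \mathbb{V}^r(x)} \big\{ -\rho \mathcal{L}_\alpha(x,\Theta) + L_\alpha f(x,\Theta) + H_\alpha r(x,\Theta) + G_\alpha h(x,\Theta) \big\}.
\]
In particular for $\alpha = 0$ we write for simplicity $\mathcal{T}_0 = \mathcal{T}$, and $\mathcal{R}_0 = \mathcal{R}$.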

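The sets of measurable selectors associated to $(\mathbb{U}(x))_{x \in \bar{E}}$, $(\mathbb{V}(x))_{x \in E}$, $(\mathbb{V}^r(x))_{x \in E}$ are defined by
$\mathcal{S}_{\mathbb{U}} = \{u \in \mathbb{M}(\bar{E}, \mathbb{U}) : (\forall x \in \bar{E}),\ u(x) \in \mathbb{U}(x)\}$,
$\mathcal{S}_{\mathbb{V}} = \{(\nu, \nu_\partial) \in \mathbb{M}(E, \mathbb{V}) : (\forall x \in E),\ (\nu(x), \nu_\partial(x)) \in \mathbb{V}(x)\}$,
$\mathcal{S}_{\mathbb{V}^r} = \{(\mu, \mu_\partial) \in \mathbb{M}(E, \mathbb{V}^r) : (\forall x \in E),\ (\mu(x), \mu_\partial(x)) \in \mathbb{V}^r(x)\}$.

For $\alpha \in \mathbb{R}_+$, $\rho \in \mathbb{R}$, and $v \in \mathbb{M}(E)$, the one-stage optimization problem associated to the operator
$\mathcal{T}_\alpha(\rho,v)$, respectively $\mathcal{R}_\alpha(\rho,v)$, consists of finding a measurable selector $\Upsilon \in \mathcal{S}_{\mathbb{V}}$, respectively $\Theta \in \mathcal{S}_{\mathbb{V}^r}$,
such that for all $x \in E$, $\mathcal{T}_\alpha(\rho, v)(x) = -\rho \mathcal{L}_\alpha(x, \Upsilon(x)) + L_\alpha f(x, \Upsilon(x)) + H_\alpha r(x, \Upsilon(x)) + G_\alpha v(x, \Upsilon(x))$ and respectively
$\mathcal{R}_\alpha(\rho, v)(x) = -\rho \mathcal{L}_\alpha(x, \Theta(x)) + L_\alpha f(x, \Theta(x)) + H_\alpha r(x, \Theta(x)) + G_\alpha v(x, \Theta(x))$.

Finally we conclude this section by recalling (see Propositions 3.8 and 3.10 in [4]) that there exist two
natural mappings from $\mathcal{S}_{\mathbb{U}}$ to $\mathcal{S}_{\mathbb{V}}$ and from $\mathcal{S}_{\mathbb{U}}$ to $\mathcal{U}$.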

Definition 2.8 For $u \in \mathcal{S}_{\mathbb{U}}$, define the measurable mapping $u_\phi$ of the space $E$ into $\mathbb{V}$ by
$u_\phi : x \to \big(u(\phi(x, \cdot)), u(\phi(x, t_*(x)))\big)$.

Definition 2.9 For $u \in \mathcal{S}_{\mathbb{U}}$, define the measurable mapping $U_{u_\phi}$ of the space $\mathbb{N} \times E \times \mathbb{R}_+$ into $\mathbb{U} \times \mathbb{U}$
by $U_{u_\phi} : (n, x, t) \to \big(u(\phi(x, t)), u(\phi(x, t_*(x)))\big)$.

Remark 2.10 The measurable selectors of the kind $u_\phi$ as in Definition 2.8 are called ordinary feedback
measurable selectors in the class $\mathcal{S}_{\mathbb{V}} \subset \mathcal{S}_{\mathbb{V}^r}$ and the control strategies of the kind $U_{u_\phi}$ as in Definition
2.9 are called ordinary feedback control strategies in the class $\mathcal{U}$.



3 Assumptions and auxiliary results 

The purpose of this section is to introduce several assumptions (see sub-section 3.1) and to derive
preliminary results that will ensure the existence of an optimal control for the long run average cost.
More specifically, the two main results of sub-section 3.2 consist, roughly speaking, of providing a
bound for $\mathcal{J}^\alpha_{\mathcal{D}}(x)$ in terms of $\alpha$ (see Corollary 3.13) and of proving that the mapping defined by
$\mathcal{J}^\alpha_{\mathcal{D}}(\cdot) - \mathcal{J}^\alpha_{\mathcal{D}}(y)$ for $y$ fixed in $E$ belongs to $\mathbb{B}_g(E)$ (see Theorem 3.17). The results of sub-section 3.3
are extensions of those presented in [4] to the case in which the functions under consideration are
not necessarily positive (as was supposed in [4]) but instead belong to $\mathbb{B}_g(E)$. It must be pointed
out that these generalizations are not straightforward and are crucial for obtaining the existence of
an optimal ordinary feedback control strategy for the long run average-cost problem of a PDMP. In
particular, Theorem 3.22 states that for any function $h \in \mathbb{B}_g(E)$, the one-stage optimization operators
$\mathcal{R}_\alpha(\rho,h)(x)$ and $\mathcal{T}_\alpha(\rho, h)(x)$ are equal and that there exists an ordinary feedback measurable selector
for the one-stage optimization problems associated to these operators.



3.1 Assumptions and definitions 

The next assumption is related to the so-called expected growth condition (see, for instance,
Assumption 3.1 in [15] for the discrete-time case, or Assumption A in [14] for the continuous-time
case) used, among other things, to guarantee uniform boundedness of $\alpha \mathcal{J}^\alpha_{\mathcal{D}}(x)$ with respect to $\alpha$.

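Assumption 3.1 Suppose that there exist $b \geq 0$, $c > 0$, $\delta > 0$, $M \geq 0$ and $g \in \mathbb{M}^{ac}(E)$, $g \geq 1$,
$\bar{r} \in \mathbb{M}(\partial E)$, $\bar{r}(z) \geq 0$, satisfying for all $x \in E$
\[
\sup_{a \in \mathbb{U}(x)} \big\{ \mathcal{X}g(x) + c g(x) - \lambda(x, a) \big[ g(x) - Qg(x, a) \big] \big\} \leq b, \tag{4}
\]
\[
\sup_{a \in \mathbb{U}(x)} \{ f(x, a) \} \leq M g(x), \tag{5}
\]
and for all $x \in E$ with $t_*(x) < \infty$
\[
\sup_{a \in \mathbb{U}(\phi(x, t_*(x)))} \big\{ \bar{r}(\phi(x, t_*(x))) + Qg(\phi(x, t_*(x)), a) \big\} \leq g(\phi(x, t_*(x))), \tag{6}
\]
\[
\sup_{a \in \mathbb{U}(\phi(x, t_*(x)))} \big\{ r(\phi(x, t_*(x)), a) \big\} \leq \frac{M}{c + \delta}\, \bar{r}(\phi(x, t_*(x))). \tag{7}
\]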
Assumptions 3.2, 3.3 and 3.4 presented in the sequel are needed to guarantee some convergence
and semi-continuity properties of the one-stage optimization operators (see sub-section 3.3), and the
existence of a measurable selector.

Assumption 3.2 For each $x \in E$, the restriction of $\lambda(x, \cdot)$ to $\mathbb{U}(x)$ is continuous, for $t \in [0, t_*(x))$,
\[
\int_0^t \sup_{a \in \mathbb{U}(\phi(x,s))} \lambda(\phi(x, s), a)\,ds < \infty
\quad \text{and if } t_*(x) < \infty \text{ then } \quad
\int_0^{t_*(x)} \sup_{a \in \mathbb{U}(\phi(x,s))} \lambda(\phi(x, s), a)\,ds < \infty.
\]

Assumption 3.3 There exists a sequence of measurable functions $(f_j)_{j \in \mathbb{N}}$ in $\mathbb{M}(E \times \mathbb{U})^+$ such that for
all $y \in E$, $f_j(y, \cdot) \uparrow f(y, \cdot)$ as $j \to \infty$ and the restriction of $f_j(y, \cdot)$ to $\mathbb{U}(y)$ is continuous. There exists
a sequence of measurable functions $(r_j)_{j \in \mathbb{N}}$ in $\mathbb{M}(\partial E \times \mathbb{U})^+$ such that for all $z \in \partial E$, $r_j(z, \cdot) \uparrow r(z, \cdot)$
as $j \to \infty$ and the restriction of $r_j(z, \cdot)$ to $\mathbb{U}(z)$ is continuous.

Assumption 3.4 For all $x \in \bar{E}$ and $h \in \mathbb{B}_g(E)$, the restriction of $Qh(x, \cdot)$ to $\mathbb{U}(x)$ is continuous.

We make the following definition:

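Definition 3.5 Consider $w \in \mathbb{M}(E)$ and $h \in \mathbb{B}_g(E)$. We define:
D1) $\hat{u}(w,h) \in \mathcal{S}_{\mathbb{U}}$ as the measurable selector satisfying
\[
\inf_{a \in \mathbb{U}(x)} \big\{ f(x, a) - \lambda(x,a) \big[ w(x) - Qh(x,a) \big] \big\}
= f(x, \hat{u}(w, h)(x)) - \lambda(x, \hat{u}(w, h)(x)) \big[ w(x) - Qh(x, \hat{u}(w, h)(x)) \big],
\]
\[
\inf_{a \in \mathbb{U}(z)} \{ r(z,a) + Qh(z,a) \} = r(z, \hat{u}(w, h)(z)) + Qh(z, \hat{u}(w, h)(z)).
\]
D2) $\hat{u}_\phi(w,h) \in \mathcal{S}_{\mathbb{V}}$ as the measurable selector derived from $\hat{u}(w,h)$ through Definition 2.8.

D3) $\widehat{U}_{u_\phi}(w,h) \in \mathcal{U}$ as the control strategy derived from $\hat{u}(w,h)$ through Definition 2.9.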

Notice that the existence of u(w,h) follows from Assumptions [3TTII3.4I and Theorem 3.3.5 in [lf|, and 
the fact that u^w, h) G <Sv> and U^w, h) follow from Proposition 3.10 in [4|. 

In the next assumption notice that for any $u \in \mathcal{S}_{\mathbb{U}}$, $G(x, u_\phi(x); \cdot)$ can be seen as the stochastic kernel
associated to the post-jump location of a PDMP. This assumption is related to some geometric ergodic
properties of the operator $G$ (see for example the comments on page 122 in [17] or Lemma 3.3 in [15]
for more details on this kind of assumption).

Assumption 3.6 Suppose that there exist $a > 0$, $0 < \kappa < 1$ and for any $u \in \mathcal{S}_{\mathbb{U}}$ there exists a
probability measure $\nu_u$, such that $\nu_u(g) < +\infty$ and
\[
\big| G^k h(x, u_\phi) - \nu_u(h) \big| \leq a \|h\|_g \kappa^k g(x), \tag{8}
\]
for all $h \in \mathbb{B}_g(E)$ and $k \in \mathbb{N}$.
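
To see what a bound of the form (8) means in the simplest possible setting, the Python sketch below iterates a two-state Markov kernel and compares $\sup_x |G^k h(x) - \nu(h)|$ with $\kappa^k$. Everything in it is an assumption made for illustration: the finite state space, the kernel `G`, the test function `h`, and the choice $g \equiv 1$. It is not the post-jump kernel of any PDMP considered in the paper.

```python
def apply_kernel(kernel, h):
    """Return the function x -> sum_y kernel[x][y] * h[y] on a finite state space."""
    return [sum(p * hy for p, hy in zip(row, h)) for row in kernel]


# Hypothetical two-state post-jump kernel G(x, dy); rows sum to one.
G = [[0.9, 0.1],
     [0.2, 0.8]]
nu = [2.0 / 3.0, 1.0 / 3.0]   # invariant probability measure: nu G = nu
kappa = 0.7                   # second eigenvalue of G
h = [1.0, -2.0]               # test function; with g = 1, ||h||_g = max |h|

nu_h = sum(w * hy for w, hy in zip(nu, h))
errors = []
gk_h = list(h)
for k in range(10):
    # errors[k] = sup_x |G^k h(x) - nu(h)|
    errors.append(max(abs(v - nu_h) for v in gk_h))
    gk_h = apply_kernel(G, gk_h)

for k, err in enumerate(errors):
    print(k, err, err / kappa ** k)
```

For this kernel $G^k h - \nu(h) = \kappa^k (h - \nu(h))$, so the ratio in the last column stays constant and (8) holds with $a = 2$ and $g \equiv 1$.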
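The final assumption is:

Assumption 3.7 There exist $\underline{\lambda} \in \mathbb{M}(E)^+$, $\bar{f} \in \mathbb{M}(E)^+$, $K_\lambda \in \mathbb{R}_+$ such that
a) $\lambda(y, a) \geq \underline{\lambda}(y)$ and $f(y, a) \leq \bar{f}(y)$ for all $y \in E$ and $a \in \mathbb{U}(y)$,
b) $\int_0^{t_*(x)} e^{ct - \int_0^t \underline{\lambda}(\phi(x,s))\,ds}\,dt \leq K_\lambda$ for all $x \in E$,
c) $\lim_{t \to +\infty} e^{ct - \int_0^t \underline{\lambda}(\phi(x,s))\,ds} = 0$ for all $x \in E$ with $t_*(x) = +\infty$,
d) $\lim_{t \to +\infty} e^{-\int_0^t \underline{\lambda}(\phi(x,s))\,ds} g(\phi(x,t)) = 0$ for all $x \in E$ with $t_*(x) = \infty$,
e) $\int_0^{t_*(x)} e^{-\int_0^t \underline{\lambda}(\phi(x,s))\,ds} \bar{f}(\phi(x,t))\,dt < \infty$.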







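Remark 3.8 Notice the following consequences of Assumption 3.7:

i) Assumption 3.7 c) implies that $G_\alpha(x,\Theta;A) = \int_0^{+\infty} e^{-\alpha s - \Lambda^\mu(x,s)} \lambda Q I_A(\phi(x,s), \mu(s))\,ds$ and
$H_\alpha w(x,\Theta) = 0$, for any $x \in E$ with $t_*(x) = +\infty$, $A \in \mathcal{B}(E)$, $\alpha \geq -c$, $\Theta = (\mu, \mu_\partial) \in \mathbb{V}^r(x)$,
$w \in \mathbb{M}(\partial E \times \mathbb{U})$.

ii) Assumptions 3.7 a) and b) imply that $\mathcal{L}_\alpha(x,\Theta) \leq K_\lambda$ for any $\alpha \geq -c$, $x \in E$, $\Theta \in \mathbb{V}^r(x)$.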



3.2 Properties of the $\alpha$-discount value function $\mathcal{J}^\alpha_{\mathcal{D}}(\cdot)$

The next two propositions establish a connection between a general integro-differential inequality
(respectively equality) related to the local characteristics of the PDMP and an inequality (respectively
equality) related to the operators $G_\alpha$, $L_\alpha$ and $H_\alpha$. They will be crucial for the boundedness results
on $\mathcal{J}^\alpha_{\mathcal{D}}(\cdot)$ to be developed in the sequel.

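Proposition 3.9 Suppose that there exist $v \in \mathbb{M}^{ac}(E; \mathbb{R}_+)$, $\ell \in \mathbb{M}(E)^+$, $k \in \mathbb{M}(E)^+$, $p \in \mathbb{M}(\partial E)^+$,
$\Theta = (\mu, \mu_\partial) \in \mathcal{S}_{\mathbb{V}^r}$, $d \geq 0$, and $\alpha \geq -c$ satisfying
\[
\mathcal{X}v(\phi(x, t)) - \big[\alpha + \lambda(\phi(x, t), \mu(x, t))\big] v(\phi(x, t)) + \ell(\phi(x, t))
+ \lambda(\phi(x, t), \mu(x, t)) Qk(\phi(x, t), \mu(x, t)) \leq d, \tag{9}
\]
for all $x \in E$, $t \in [0, t_*(x))$, and
\[
v(\phi(x, t_*(x))) \geq p(\phi(x, t_*(x))) + Qk\big(\phi(x, t_*(x)), \mu_\partial(\phi(x, t_*(x)))\big), \tag{10}
\]
for all $x \in E$ with $t_*(x) < \infty$.
Then
\[
v(x) \geq -d \mathcal{L}_\alpha(x, \Theta(x)) + L_\alpha \ell(x, \Theta(x)) + H_\alpha p(x, \Theta(x)) + G_\alpha k(x, \Theta(x)). \tag{11}
\]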

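Proof: Multiplying both sides of equation (9) by $e^{-\alpha t - \Lambda^{\mu(x)}(x,t)}$ and integrating over $[0, s]$ for $s \in
[0, t_*(x))$ we get that
\[
d \int_0^s e^{-\alpha t - \Lambda^{\mu(x)}(x,t)}\,dt \geq e^{-\alpha s - \Lambda^{\mu(x)}(x,s)} v(\phi(x, s)) - v(x)
+ \int_0^s e^{-\alpha t - \Lambda^{\mu(x)}(x,t)} \big[ \ell(\phi(x, t)) + \lambda(\phi(x, t), \mu(x, t)) Qk(\phi(x, t), \mu(x, t)) \big]\,dt. \tag{12}
\]
Consider the case in which $t_*(x) < \infty$. By using the fact that $v \in \mathbb{M}^{ac}(E)$, we obtain from Remark
3.8 ii) and equation (12) that
\[
v(x) \geq -d \mathcal{L}_\alpha(x, \Theta(x)) + L_\alpha \ell(x, \Theta(x)) + e^{-\alpha t_*(x) - \Lambda^{\mu(x)}(x, t_*(x))} v(\phi(x, t_*(x)))
+ \int_0^{t_*(x)} e^{-\alpha t - \Lambda^{\mu(x)}(x,t)} \lambda(\phi(x,t), \mu(x,t)) Qk(\phi(x,t), \mu(x,t))\,dt. \tag{13}
\]
However, from equation (10), it follows that
\[
v(x) \geq -d \mathcal{L}_\alpha(x, \Theta(x)) + L_\alpha \ell(x, \Theta(x)) + H_\alpha p(x, \Theta(x)) + G_\alpha k(x, \Theta(x)).
\]
Now consider the case in which $t_*(x) = +\infty$. From equation (12) (and recalling that $v$ is positive),
we have that
\[
d \int_0^s e^{-\alpha t - \Lambda^{\mu(x)}(x,t)}\,dt \geq -v(x) + \int_0^s e^{-\alpha t - \Lambda^{\mu(x)}(x,t)} \big[ \ell(\phi(x,t)) + \lambda(\phi(x, t), \mu(x, t)) Qk(\phi(x, t), \mu(x, t)) \big]\,dt,
\]
and so, taking the limit as $s$ tends to infinity in the previous inequality yields
\[
v(x) \geq -d \mathcal{L}_\alpha(x, \Theta(x)) + L_\alpha \ell(x, \Theta(x))
+ \int_0^{t_*(x)} e^{-\alpha t - \Lambda^{\mu(x)}(x,t)} \lambda(\phi(x,t), \mu(x,t)) Qk(\phi(x,t), \mu(x,t))\,dt.
\]
However, by using the fact that $t_*(x) = +\infty$ and Remark 3.8 i), we have that $H_\alpha p(x, \Theta(x)) = 0$ and
$G_\alpha k(x, \Theta(x)) = \int_0^{t_*(x)} e^{-\alpha t - \Lambda^{\mu(x)}(x,t)} \lambda(\phi(x, t), \mu(x, t)) Qk(\phi(x, t), \mu(x, t))\,dt$, showing the result. □
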
If the inequalities in (9) and (10) are replaced by equalities then the hypotheses of Proposition 3.9
must be restricted to $\alpha \geq 0$ to show that the inequality in (11) becomes an equality. More specifically,
we have the following result:

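Proposition 3.10 Suppose that there exist $v \in \mathbb{M}^{ac}(E; \mathbb{R}_+)$, $\ell \in \mathbb{M}(E)^+$, $k \in \mathbb{M}(E)^+$, $p \in \mathbb{M}(\partial E)^+$,
$\Theta = (\mu, \mu_\partial) \in \mathcal{S}_{\mathbb{V}^r}$, $d \geq 0$, and $\alpha \geq 0$ satisfying
\[
\mathcal{X}v(\phi(x, t)) - \big[\alpha + \lambda(\phi(x, t), \mu(x, t))\big] v(\phi(x, t)) + \ell(\phi(x, t))
+ \lambda(\phi(x, t), \mu(x, t)) Qk(\phi(x, t), \mu(x, t)) = d, \tag{14}
\]
for all $x \in E$, $t \in [0, t_*(x))$, and
\[
v(\phi(x, t_*(x))) = p(\phi(x, t_*(x))) + Qk\big(\phi(x, t_*(x)), \mu_\partial(\phi(x, t_*(x)))\big), \tag{15}
\]
for all $x \in E$ with $t_*(x) < \infty$.
Then
\[
v(x) = -d \mathcal{L}_\alpha(x, \Theta(x)) + L_\alpha \ell(x, \Theta(x)) + H_\alpha p(x, \Theta(x)) + G_\alpha k(x, \Theta(x)). \tag{16}
\]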

Proof: By following the same steps as in the first part of the proof of Proposition 3.9 we have that
for all $s \in [0, t_*(x))$,
\[
d \int_0^s e^{-\alpha t - \Lambda^{\mu(x)}(x,t)}\,dt = e^{-\alpha s - \Lambda^{\mu(x)}(x,s)} v(\phi(x, s)) - v(x)
+ \int_0^s e^{-\alpha t - \Lambda^{\mu(x)}(x,t)} \big[ \ell(\phi(x,t)) + \lambda(\phi(x, t), \mu(x, t)) Qk(\phi(x, t), \mu(x, t)) \big]\,dt. \tag{17}
\]
The case in which $t_*(x) < \infty$ can be treated in the same manner as in the proof of Proposition 3.9.
However, the case in which $t_*(x) = +\infty$ is different. By using Assumption 3.7 d) and the fact that
$0 \leq v \leq \|v\|_g\, g$, we have that for any $\alpha \geq 0$,
\[
\lim_{s \to +\infty} e^{-\alpha s - \Lambda^{\mu(x)}(x,s)} v(\phi(x,s)) \leq \|v\|_g \lim_{s \to +\infty} e^{-\int_0^s \underline{\lambda}(\phi(x,t))\,dt} g(\phi(x, s)) = 0.
\]
Therefore, taking the limit as $s$ tends to infinity in equation (17), we have that
\[
d \mathcal{L}_\alpha(x, \Theta(x)) = -v(x) + L_\alpha \ell(x, \Theta(x))
+ \int_0^{t_*(x)} e^{-\alpha t - \Lambda^{\mu(x)}(x,t)} \lambda(\phi(x, t), \mu(x, t)) Qk(\phi(x, t), \mu(x, t))\,dt,
\]
and this shows equation (16) by using Remark 3.8 i). □

Applying Proposition 3.9 to the inequalities (4) and (6) we obtain the following corollary:

Corollary 3.11 For any $u \in \mathcal{S}_{\mathbb{U}}$, $\alpha \geq -c$, and $x \in E$
\[
g(x) \geq -b \mathcal{L}_\alpha(x, u_\phi(x)) + (c + \alpha) L_\alpha g(x, u_\phi(x)) + H_\alpha \bar{r}(x, u_\phi(x)) + G_\alpha g(x, u_\phi(x)), \tag{18}
\]
and for all $\Theta \in \mathcal{S}_{\mathbb{V}^r}$
\[
(c + \alpha) L_\alpha g(x, \Theta(x)) + H_\alpha \bar{r}(x, \Theta(x)) + G_\alpha g(x, \Theta(x)) \leq b K_\lambda + g(x). \tag{19}
\]
Proof: Clearly from Proposition 3.8 and Remark 3.11 in [4], it follows that $u_\phi \in \mathcal{S}_{\mathbb{V}^r}$. Consequently,
setting $d = b$, $v = g$, $\ell = (c + \alpha) g$, $p = \bar{r}$, $k = g$, and $\Theta = u_\phi$ in Proposition 3.9 we get equation (18).
Similarly, from Remark 3.8 ii), the inequality (19) is a straightforward consequence of the inequality
(11). □



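The next theorem provides bounds in terms of $\alpha$ and $g$ for a sequence of functions defined by a
general recursive equation and for the functions $Lf$, $Hr$ and $Lg$.

Theorem 3.12 Define the sequence $(q^\alpha_m(x))_{m \in \mathbb{N}}$ by
$q^\alpha_0(x) = 0$,
\[
q^\alpha_{m+1}(x) = L_\alpha f(x, u^m_\phi(x)) + H_\alpha r(x, u^m_\phi(x)) + G_\alpha q^\alpha_m(x, u^m_\phi(x)), \tag{20}
\]
where $x \in E$, $(u^m)_{m \in \mathbb{N}} \in \mathcal{S}_{\mathbb{U}}$ and $\alpha \geq 0$.
Then the following assertions hold:

i) for any $x \in E$, $m \in \mathbb{N}$ and $\alpha \in [0, \delta)$, we have that
\[
q^\alpha_m(x) \leq \frac{M}{c + \alpha} g(x) + \frac{Mb}{c\alpha}, \tag{21}
\]
ii) for any $x \in E$, $u \in \mathcal{S}_{\mathbb{U}}$,
\[
0 \leq Lf(x, u_\phi(x)) + Hr(x, u_\phi(x)) \leq \frac{M(1 + bK_\lambda)}{c}\, g(x), \tag{22}
\]
\[
0 \leq Lg(x, u_\phi(x)) \leq \frac{1 + bK_\lambda}{c}\, g(x). \tag{23}
\]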
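Proof: Let us show (21) by induction. For $m = 0$ it is immediate since $q^\alpha_0 = 0$. Suppose it holds for
$m$. Combining (20) and (21) we have
\[
q^\alpha_{m+1}(x) \leq L_\alpha f(x, u^m_\phi(x)) + H_\alpha r(x, u^m_\phi(x)) + \frac{M}{c + \alpha} G_\alpha g(x, u^m_\phi(x)) + \frac{Mb}{c\alpha} G_\alpha 1(x, u^m_\phi(x)). \tag{24}
\]
Moreover, from equation (18), we obtain that
\[
G_\alpha g(x, u^m_\phi(x)) \leq g(x) + b \mathcal{L}_\alpha(x, u^m_\phi(x)) - (c + \alpha) L_\alpha g(x, u^m_\phi(x)) - H_\alpha \bar{r}(x, u^m_\phi(x)). \tag{25}
\]
Replacing (25) into (24) and using (5) and (7), we get
\[
\begin{aligned}
q^\alpha_{m+1}(x) &\leq L_\alpha (f - Mg)(x, u^m_\phi(x)) + H_\alpha \Big(r - \frac{M}{c + \alpha} \bar{r}\Big)(x, u^m_\phi(x)) + \frac{M}{c + \alpha} g(x) \\
&\quad + Mb \Big( \frac{1}{c\alpha} G_\alpha 1(x, u^m_\phi(x)) + \frac{1}{c + \alpha} \mathcal{L}_\alpha(x, u^m_\phi(x)) \Big) \\
&\leq \frac{M}{c + \alpha} g(x) + \frac{Mb}{c\alpha} \Big( G_\alpha 1(x, u^m_\phi(x)) + \alpha \mathcal{L}_\alpha(x, u^m_\phi(x)) \Big) \\
&\leq \frac{M}{c + \alpha} g(x) + \frac{Mb}{c\alpha},
\end{aligned} \tag{26}
\]
since $G_\alpha 1(x, u^m_\phi(x)) + \alpha \mathcal{L}_\alpha(x, u^m_\phi(x)) = 1$.

Let us show now (22) and (23). For $\alpha = 0$ it follows from Remark 3.8 ii) and equation (18) that
\[
g(x) + bK_\lambda \geq g(x) + b \mathcal{L}(x, u_\phi(x)) \geq c Lg(x, u_\phi(x)) + H\bar{r}(x, u_\phi(x)) + Gg(x, u_\phi(x)), \tag{27}
\]
showing equation (23) since $g \geq 1$ and $\bar{r} \geq 0$. Now, combining equations (5), (7) and (27) we get (22),
showing the last part of the result. □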

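Based on the previous result, we obtain the following corollary showing that the $\alpha$-discount value
function $\mathcal{J}^\alpha_{\mathcal{D}}(\cdot)$ belongs to $\mathbb{B}_g(E)$ and providing a bound for $\mathcal{J}^\alpha_{\mathcal{D}}(x)$ in terms of $\alpha$.

Corollary 3.13 For any $\alpha > 0$ and $x \in E$,
\[
\mathcal{J}^\alpha_{\mathcal{D}}(x) \leq \frac{M}{c + \alpha} g(x) + \frac{Mb}{c\alpha}. \tag{28}
\]
Proof: By using Propositions 7.1 and 7.5 in [4], it can be shown that there exists $u^m \in \mathcal{S}_{\mathbb{U}}$ such that
the sequence $(v^\alpha_m(x))_{m \in \mathbb{N}}$ defined by $v^\alpha_{m+1}(x) = L_\alpha f(x, u^m_\phi(x)) + H_\alpha r(x, u^m_\phi(x)) + G_\alpha v^\alpha_m(x, u^m_\phi(x))$
and $v^\alpha_0(x) = 0$ satisfies $v^\alpha_{m+1}(x) \uparrow \mathcal{J}^\alpha_{\mathcal{D}}(x)$ as $m \uparrow \infty$. Therefore, considering $q^\alpha_m = v^\alpha_m$ in Theorem 3.12
and taking the limit as $m \uparrow \infty$ we get (28). □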



The following technical lemma shows that $\mathcal{J}^\alpha_{\mathcal{D}}(x)$ can be written as an infinite sum of iterates of the
stochastic kernel $G_\alpha$. Using this result, $\mathcal{J}^\alpha_{\mathcal{D}}(x)$ is characterized in terms of the Markov kernel $G$ in
Proposition 3.15. This is an important property. Indeed, by using classical hypotheses on $G$ such as
the geometric ergodic condition in Assumption 3.6, it will be shown in Theorem 3.17 that the mapping
defined by $\mathcal{J}^\alpha_{\mathcal{D}}(\cdot) - \mathcal{J}^\alpha_{\mathcal{D}}(y)$ for $y$ fixed in $E$ belongs to $\mathbb{B}_g(E)$.

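Lemma 3.14 For each $\alpha > 0$ there exists $u^\alpha \in \mathcal{S}_{\mathbb{U}}$ such that
\[
\mathcal{J}^\alpha_{\mathcal{D}}(x) = \sum_{k=0}^{\infty} G^k_\alpha (L_\alpha f + H_\alpha r)(x, u^\alpha_\phi(x)). \tag{29}
\]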



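Proof: As shown in [4, Theorem 7.5], $\mathcal{J}^\alpha_{\mathcal{D}} \in \mathbb{M}(E)$ and $\mathcal{J}^\alpha_{\mathcal{D}}(x) = \mathcal{R}_\alpha(0, \mathcal{J}^\alpha_{\mathcal{D}})(x)$. Moreover, from Theorem 6.4 in [4], there exists $u^\alpha \in \mathcal{S}_{\mathbb{U}}$ such that the ordinary feedback measurable selector $u^\alpha_\phi \in \mathcal{S}_{\mathbb{V}}$ satisfies

$\mathcal{J}^\alpha_{\mathcal{D}}(x) = \mathcal{R}_\alpha(0, \mathcal{J}^\alpha_{\mathcal{D}})(x) = \mathcal{T}_\alpha(0, \mathcal{J}^\alpha_{\mathcal{D}})(x) = L_\alpha f(x, u^\alpha_\phi(x)) + H_\alpha r(x, u^\alpha_\phi(x)) + G_\alpha \mathcal{J}^\alpha_{\mathcal{D}}(x, u^\alpha_\phi(x))$. (30)

Iterating (30) and recalling that $\mathcal{J}^\alpha_{\mathcal{D}}(y) \geq 0$ for every $y$ yields, for every $m \in \mathbb{N}$,

$\mathcal{J}^\alpha_{\mathcal{D}}(x) = \sum_{k=0}^{m-1} G_\alpha^k (L_\alpha f + H_\alpha r)(x, u^\alpha_\phi(x)) + G_\alpha^m \mathcal{J}^\alpha_{\mathcal{D}}(x, u^\alpha_\phi(x)) \geq \sum_{k=0}^{m-1} G_\alpha^k (L_\alpha f + H_\alpha r)(x, u^\alpha_\phi(x))$. (31)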

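For the control $U_{u^\alpha_\phi} \in \mathcal{U}$ (see Definition 2.9), it is easy to show that

$\sum_{k=0}^{m-1} G_\alpha^k (L_\alpha f + H_\alpha r)(x, u^\alpha_\phi(x)) = E^{U_{u^\alpha_\phi}}_{(x,0)} \Big[ \int_0^{T_m} e^{-\alpha s} f(X(s), u(N(s), Z(s), \tau(s))) ds + \int_0^{T_m} e^{-\alpha s} r(X(s-), u_\partial(N(s-), Z(s-))) dp^*(s) \Big]$, (32)

where $U_{u^\alpha_\phi} = (u, u_\partial)$. From Assumption 2.3, $T_m \to \infty$, $P^{U_{u^\alpha_\phi}}_{(x,0)}$-a.s. Therefore, from the monotone convergence theorem, equation (32) implies that $\sum_{k=0}^{\infty} G_\alpha^k (L_\alpha f + H_\alpha r)(x, u^\alpha_\phi(x)) = \mathcal{D}^\alpha(U_{u^\alpha_\phi}, x)$, and from equation (31)

$\mathcal{J}^\alpha_{\mathcal{D}}(x) \geq \sum_{k=0}^{\infty} G_\alpha^k (L_\alpha f + H_\alpha r)(x, u^\alpha_\phi(x)) = \mathcal{D}^\alpha(U_{u^\alpha_\phi}, x)$. (33)

But since $U_{u^\alpha_\phi} \in \mathcal{U}$ and $\mathcal{J}^\alpha_{\mathcal{D}}(x) = \inf_{U \in \mathcal{U}} \mathcal{D}^\alpha(U, x)$ it is clear that $\mathcal{D}^\alpha(U_{u^\alpha_\phi}, x) \geq \mathcal{J}^\alpha_{\mathcal{D}}(x)$, so that (33) yields (29). □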



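The next proposition gives a characterization of $\mathcal{J}^\alpha_{\mathcal{D}}(x)$ in terms of $G$.

Proposition 3.15 For $\alpha > 0$ and $u^\alpha$ as in Lemma 3.14 define the sequence $(s^\alpha_m(x))_{m \in \mathbb{N}}$ for $x \in E$ by $s^\alpha_0(x) = 0$ and $s^\alpha_{m+1}(x) = L_\alpha f(x, u^\alpha_\phi(x)) + H_\alpha r(x, u^\alpha_\phi(x)) + G_\alpha s^\alpha_m(x, u^\alpha_\phi(x))$. Then

$\mathcal{J}^\alpha_{\mathcal{D}}(x) = \lim_{m \to \infty} \sum_{k=0}^{m} G^k (L(f - \alpha s^\alpha_{m+1-k}) + Hr)(x, u^\alpha_\phi(x))$. (34)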

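Proof: By definition, for all $m \in \mathbb{N}$, $s^\alpha_m \in \mathbb{M}(E)$ and $s^\alpha_{m+1}(x) = \sum_{k=0}^{m} G_\alpha^k (L_\alpha f + H_\alpha r)(x, u^\alpha_\phi(x))$, and clearly from Lemma 3.14 we have that $s^\alpha_m \uparrow \mathcal{J}^\alpha_{\mathcal{D}}$ as $m \uparrow \infty$. Applying Lemma 9.2 in [4], it can be shown that $s^\alpha_{m+1} \in \mathbb{M}^{ac}(E)$ and for all $x \in E$ and $t \in [0, t_*(x))$,

$s^\alpha_{m+1}(x) = \int_0^t e^{-\alpha s - \int_0^s \lambda(\phi(x,\theta), u^\alpha(\phi(x,\theta))) d\theta} \Big[ f(\phi(x,s), u^\alpha(\phi(x,s))) + \lambda(\phi(x,s), u^\alpha(\phi(x,s))) Q s^\alpha_m(\phi(x,s), u^\alpha(\phi(x,s))) \Big] ds + e^{-\alpha t - \int_0^t \lambda(\phi(x,\theta), u^\alpha(\phi(x,\theta))) d\theta} s^\alpha_{m+1}(\phi(x,t))$,

implying that

$\mathcal{X} s^\alpha_{m+1}(x) - [\alpha + \lambda(x, u^\alpha(x))] s^\alpha_{m+1}(x) + f(x, u^\alpha(x)) + \lambda(x, u^\alpha(x)) Q s^\alpha_m(x, u^\alpha(x)) = 0$. (35)

Consider the case in which $t_*(x) < \infty$. Since $s^\alpha_{m+1} \in \mathbb{M}^{ac}(E)$, this yields that

$s^\alpha_{m+1}(x) = L_\alpha f(x, u^\alpha_\phi(x)) + e^{-\alpha t_*(x) - \int_0^{t_*(x)} \lambda(\phi(x,s), u^\alpha(\phi(x,s))) ds} s^\alpha_{m+1}(\phi(x, t_*(x))) + \int_0^{t_*(x)} e^{-\alpha s - \int_0^s \lambda(\phi(x,\theta), u^\alpha(\phi(x,\theta))) d\theta} \lambda(\phi(x,s), u^\alpha(\phi(x,s))) Q s^\alpha_m(\phi(x,s), u^\alpha(\phi(x,s))) ds$. (36)

From Assumption 3.2 we have that $e^{-\int_0^{t_*(x)} \lambda(\phi(x,s), u^\alpha(\phi(x,s))) ds} > 0$. Therefore, combining the definition of $s^\alpha_{m+1}$ and equation (36), it gives

$s^\alpha_{m+1}(\phi(x, t_*(x))) = Q s^\alpha_m(\phi(x, t_*(x)), u^\alpha(\phi(x, t_*(x)))) + r(\phi(x, t_*(x)), u^\alpha(\phi(x, t_*(x))))$. (37)

Using Proposition 3.10 we get from (35), (37) that

$s^\alpha_{m+1}(x) = L(f - \alpha s^\alpha_{m+1})(x, u^\alpha_\phi(x)) + Hr(x, u^\alpha_\phi(x)) + G s^\alpha_m(x, u^\alpha_\phi(x))$. (38)

Iterating (38) over $m$ yields (34). □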



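Before showing that the mapping defined by $\mathcal{J}^\alpha_{\mathcal{D}}(\cdot) - \mathcal{J}^\alpha_{\mathcal{D}}(y)$ for $y$ fixed in $E$ belongs to $\mathbb{B}_g(E)$, we need to prove that the mapping $L(f - \alpha s^\alpha_{m+1})(\cdot, u^\alpha_\phi(\cdot)) + Hr(\cdot, u^\alpha_\phi(\cdot))$ belongs to $\mathbb{B}_g(E)$.

Lemma 3.16 Define $M' = M \frac{(1 + \frac{b}{c})(1 + bK_\lambda)}{c}$. For $\alpha > 0$, $u^\alpha$ as in Lemma 3.14, $s^\alpha_m$ as in Proposition 3.15 and $x \in E$, we have that

$|L(f - \alpha s^\alpha_{m+1})(x, u^\alpha_\phi(x)) + Hr(x, u^\alpha_\phi(x))| \leq M' g(x)$. (39)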

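Proof: Notice that

$-\alpha L s^\alpha_{m+1}(x, u^\alpha_\phi(x)) \leq L(f - \alpha s^\alpha_{m+1})(x, u^\alpha_\phi(x)) + Hr(x, u^\alpha_\phi(x)) \leq Lf(x, u^\alpha_\phi(x)) + Hr(x, u^\alpha_\phi(x))$. (40)

Considering $u^m = u^\alpha$ in Theorem 3.12 and recalling that $g \geq 1$ we get from equation (21) that

$s^\alpha_m(x) \leq \frac{M}{c+\alpha} g(x) + \frac{Mb}{c\alpha} \leq \frac{M}{\alpha} \Big(1 + \frac{b}{c}\Big) g(x)$. (41)

Therefore from (41) we have that $\alpha s^\alpha_m \leq M(1 + \frac{b}{c}) g$ and thus, from (23),

$\alpha L s^\alpha_{m+1}(x, u^\alpha_\phi(x)) \leq \frac{M(1 + \frac{b}{c})(1 + bK_\lambda)}{c} g(x)$. (42)

By combining equations (22), (40) and (42) the result follows. □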
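
The second inequality in (41) is elementary arithmetic. A short worked version is sketched below; it uses only $g \geq 1$, $\alpha > 0$, $c > 0$ from the text, together with the assumption (made here for the illustration) that $M \geq 0$ and $b \geq 0$.

```latex
\begin{align*}
\frac{M}{c+\alpha}\,g(x) + \frac{Mb}{c\alpha}
  &\leq \frac{M}{\alpha}\,g(x) + \frac{M}{\alpha}\cdot\frac{b}{c}
  && \text{since } c+\alpha \geq \alpha > 0, \\
  &\leq \frac{M}{\alpha}\Bigl(1+\frac{b}{c}\Bigr)\,g(x)
  && \text{since } g(x) \geq 1 .
\end{align*}
```

Multiplying through by $\alpha$ removes the dependence on $\alpha$, which is what makes the bound on $\alpha s^\alpha_m$ uniform in the discount factor.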

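Finally, it is shown that $\mathcal{J}^\alpha_{\mathcal{D}}(\cdot) - \mathcal{J}^\alpha_{\mathcal{D}}(y)$ for $y$ fixed in $E$ belongs to $\mathbb{B}_g(E)$.

Theorem 3.17 For any $\alpha > 0$ and $(x, y) \in E^2$,

$|\mathcal{J}^\alpha_{\mathcal{D}}(x) - \mathcal{J}^\alpha_{\mathcal{D}}(y)| \leq \frac{aM'}{1-\kappa} (1 + g(y)) g(x)$. (43)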

Proof: From Assumption 3.6 and Lemma 3.16 we get that for all $x \in E$,

$\big| G^k (L(f - \alpha s^\alpha_{m+1-k}) + Hr)(x, u^\alpha_\phi(x)) - \nu_{u^\alpha}(L(f - \alpha s^\alpha_{m+1-k}) + Hr) \big| \leq aM' \kappa^k g(x)$.

Consequently,

$\Big| \sum_{k=0}^{m} \Big[ G^k (L(f - \alpha s^\alpha_{m+1-k}) + Hr)(x, u^\alpha_\phi(x)) - G^k (L(f - \alpha s^\alpha_{m+1-k}) + Hr)(y, u^\alpha_\phi(y)) \Big] \Big| \leq \frac{aM' (g(x) + g(y))}{1-\kappa}$.

Taking the limit as $m \uparrow \infty$ in the previous equation and recalling that $g \geq 1$ we get the desired result from Proposition 3.15. □
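
The step from the termwise bound to the summed bound can be spelled out as follows. This is only a sketch: $F_k$ is shorthand introduced here (it is not notation from the paper) for the function $L(f - \alpha s^\alpha_{m+1-k}) + Hr$ evaluated along $u^\alpha_\phi$, $\nu$ stands for the invariant probability measure of Assumption 3.6, and $0 < \kappa < 1$ is assumed.

```latex
\begin{align*}
\bigl|G^k F_k(x) - G^k F_k(y)\bigr|
  &\leq \bigl|G^k F_k(x) - \nu(F_k)\bigr| + \bigl|\nu(F_k) - G^k F_k(y)\bigr|
   \leq aM'\kappa^k\bigl(g(x)+g(y)\bigr), \\
\sum_{k=0}^{m} aM'\kappa^k\bigl(g(x)+g(y)\bigr)
  &= aM'\bigl(g(x)+g(y)\bigr)\,\frac{1-\kappa^{m+1}}{1-\kappa}
   \leq \frac{aM'}{1-\kappa}\bigl(g(x)+g(y)\bigr), \\
g(x)+g(y)
  &\leq g(x) + g(x)\,g(y) = \bigl(1+g(y)\bigr)\,g(x)
  && \text{since } g(x) \geq 1,\ g(y) \geq 0 .
\end{align*}
```

The bound is uniform in both $m$ and $\alpha$, which is the property used later in the vanishing discount argument.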
3.3 Convergence and semi-continuity results

The main goal of this sub-section is to show that there exists an ordinary feedback measurable selector for the one-stage optimization problems. First we present in the next two results some convergence and semi-continuity properties of $G_\alpha$, $H_\alpha$, $L_\alpha$ and $\mathcal{L}_\alpha$.

Proposition 3.18 Consider $\alpha \in \mathbb{R}_+$, a non increasing sequence of positive numbers $\{\alpha_k\}$ with $\alpha_k \downarrow \alpha$, and a sequence of functions $(h_k)_{k \in \mathbb{N}} \in \mathbb{B}_g(E)$ such that there exists $K_h$ satisfying $|h_k(x)| \leq K_h g(x)$ for all $k$ and all $x \in E$. Set $h = \liminf_{k \to \infty} h_k$. For $x \in E$, consider $\Theta_n = (\mu_n, \mu_{\partial,n}) \in \mathbb{V}^r(x)$ and $\Theta = (\mu, \mu_\partial) \in \mathbb{V}^r(x)$ such that $\Theta_n \to \Theta$. We have the following results:

a) $\lim_{n \to \infty} \mathcal{L}_{\alpha_n}(x, \Theta_n) = \mathcal{L}_\alpha(x, \Theta)$,
b) $\liminf_{n \to \infty} L_{\alpha_n} f(x, \Theta_n) \geq L_\alpha f(x, \Theta)$,
c) $\liminf_{n \to \infty} H_{\alpha_n} r(x, \Theta_n) \geq H_\alpha r(x, \Theta)$,
d) $\liminf_{n \to \infty} G_{\alpha_n} h_n(x, \Theta_n) \geq G_\alpha h(x, \Theta)$.

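Proof: The proofs of a), b), c) are the same as in Proposition 5.7 in [1]. It only remains to show d). Set $\tilde{h}_k = h_k + K_h g$, $\tilde{h} = h + K_h g$ and $g_k = \inf_{j \geq k} \tilde{h}_j$ (thus $g_k \uparrow \tilde{h}$ and $g_k \leq \tilde{h}_n$ for $n \geq k$). By hypothesis, $g_k(y) \geq 0$ for all $y \in E$. We have that $g_k$ is the limit of a nondecreasing sequence of measurable bounded functions $g_{k,i} \in \mathbb{M}(E)$. Set $\lambda_m(y, a) = m \wedge \lambda(y, a)$. From Assumptions 3.2 and 3.4 we have that for each $k$, $i$, $m$ and $y \in E$, $\lambda_m Q g_{k,i}(y, \cdot)$ is continuous on $\mathbb{U}(y)$. Assumption 3.7 and the fact that for each $k$, $i$, $g_{k,i}$ is bounded above by, say, $M_{k,i}$, yield that

$0 \leq \int_0^{t_*(x)} e^{-\int_0^t \underline{\lambda}(\phi(x,s)) ds} \sup_{a \in \mathbb{U}(\phi(x,t))} (\lambda_m Q g_{k,i}(\phi(x,t), a)) dt \leq m M_{k,i} K_\lambda$.

Since $(\lambda_m Q g_{k,i})(y, a) \geq 0$ and it is continuous in $a$, we have from b) that $\liminf_{n \to \infty} L_{\alpha_n}(\lambda_m Q g_{k,i})(x, \Theta_n) \geq L_\alpha(\lambda_m Q g_{k,i})(x, \Theta)$, and thus, recalling that $g_{k,i} \leq \tilde{h}_n$ for $n \geq k$ and $\lambda_m \leq \lambda$,

$\liminf_{n \to \infty} L_{\alpha_n}(\lambda Q \tilde{h}_n)(x, \Theta_n) \geq L_\alpha(\lambda_m Q g_{k,i})(x, \Theta)$.

From the monotone convergence theorem and taking the limit over $m$, $i$, $k$ we get that

$\liminf_{n \to \infty} L_{\alpha_n}(\lambda Q \tilde{h}_n)(x, \Theta_n) \geq L_\alpha(\lambda Q \tilde{h})(x, \Theta)$. (44)

By using the same arguments as above, it can be shown that

$\liminf_{n \to \infty} L_{\alpha_n}(\lambda Q g)(x, \Theta_n) \geq L_\alpha(\lambda Q g)(x, \Theta)$. (45)

Moreover, from equation (19), we have for any $v \in \mathbb{B}_g(E)$ that $|G_\alpha v(x, \Theta)| \leq \|v\|_g (bK_\lambda + g(x))$ for all $x \in E$ and $\Theta \in \mathbb{V}^r$, and hence

$\liminf_{n \to \infty} L_{\alpha_n}(\lambda Q \tilde{h}_n)(x, \Theta_n) = \liminf_{n \to \infty} L_{\alpha_n}(\lambda Q h_n)(x, \Theta_n) + K_h \lim_{n \to \infty} L_{\alpha_n}(\lambda Q g)(x, \Theta_n)$.

Similarly

$L_\alpha(\lambda Q \tilde{h})(x, \Theta) = L_\alpha(\lambda Q h)(x, \Theta) + K_h L_\alpha(\lambda Q g)(x, \Theta)$.

By combining equations (44) and (45) we get that $\liminf_{n \to \infty} L_{\alpha_n}(\lambda Q h_n)(x, \Theta_n) \geq L_\alpha(\lambda Q h)(x, \Theta)$. Using similar arguments as above and c) we can show that

$\liminf_{n \to \infty} H_{\alpha_n} h_n(x, \Theta_n) \geq H_\alpha h(x, \Theta)$,

completing the proof of d). □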
Corollary 3.19 For $x \in E$ and $h \in \mathbb{B}_g(E)$, $\mathcal{L}_\alpha(x, \Theta)$ is continuous on $\mathbb{V}^r(x)$ and $G_\alpha h(x, \Theta)$ (respectively, $L_\alpha f(x, \Theta)$, $H_\alpha r(x, \Theta)$) is lower semicontinuous on $\mathbb{V}^r(x)$.

Proof: By taking $\alpha_k = \alpha > 0$, $h_k = h$ in Proposition 3.18 the results follow. □



The next two technical lemmas will be used to derive the main result of this sub-section, which is Theorem 3.22.

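Lemma 3.20 Let $\alpha > 0$, $\rho \in \mathbb{R}_+$, $h \in \mathbb{B}_g(E)$ and set $w = \mathcal{R}_\alpha(\rho, h)$. Then there exists $\widehat{\Theta} \in \mathcal{S}_{\mathbb{V}^r}$ such that

$w(x) = -\rho \mathcal{L}_\alpha(x, \widehat{\Theta}(x)) + L_\alpha f(x, \widehat{\Theta}(x)) + H_\alpha r(x, \widehat{\Theta}(x)) + G_\alpha h(x, \widehat{\Theta}(x))$. (46)

Moreover, $w \in \mathbb{M}^{ac}(E)$, and satisfies for all $x \in E$ and $t \in [0, t_*(x))$,

$w(x) = \inf_{\mu \in \mathcal{V}^r(x)} \Big\{ \int_0^t e^{-\alpha s - \Lambda^{\mu}(x,s)} \Big[ -\rho + f(\phi(x,s), \mu(s)) + \lambda Qh(\phi(x,s), \mu(s)) \Big] ds + e^{-\alpha t - \Lambda^{\mu}(x,t)} w(\phi(x,t)) \Big\}$ (47)

$= \int_0^t e^{-\alpha s - \Lambda^{\widehat{\mu}(x)}(x,s)} \Big[ -\rho + f(\phi(x,s), \widehat{\mu}(x,s)) + \lambda Qh(\phi(x,s), \widehat{\mu}(x,s)) \Big] ds + e^{-\alpha t - \Lambda^{\widehat{\mu}(x)}(x,t)} w(\phi(x,t))$, (48)

where $\widehat{\Theta}(x) = (\widehat{\mu}(x), \widehat{\mu}_\partial(x))$.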

Proof: From Corollary 3.11 it follows that the mapping $V$ defined on $\mathcal{K}$ by

$V(x, \Theta) = -\rho \mathcal{L}_\alpha(x, \Theta) + L_\alpha f(x, \Theta) + H_\alpha r(x, \Theta) + G_\alpha h(x, \Theta)$

takes values in $\mathbb{R}$. Moreover, from Assumption 2.6 and Proposition 3.4 in [4], it follows that $V$ is measurable. Furthermore, by using Corollary 5.8 in [4] it follows that for all $x \in E$, $V(x, \cdot)$ is lower semicontinuous on $\mathbb{V}^r(x)$. Recalling that $\mathbb{V}^r(x)$ is a compact subset of $\mathbb{V}^r$ and by using Proposition D.5 in [16], we obtain that there exists $\widehat{\Theta} \in \mathcal{S}_{\mathbb{V}^r}$ such that equation (46) is satisfied. The rest of the proof is similar to the proof of Proposition 4.2 in [4] and it is therefore omitted. □



Lemma 3.21 Let $\alpha > 0$, $\rho \in \mathbb{R}_+$ and $h \in \mathbb{B}_g(E)$. Then, for all $x \in E$,

$\mathcal{R}_\alpha(\rho, h)(x) \geq -(\rho + b\|h\|_g) K_\lambda - \|h\|_g g(x)$, (49)

and for all $x \in E$ such that $t_*(x) = \infty$ and $\Theta = (\mu, \mu_\partial) \in \mathbb{V}^r(x)$,

$-\rho \mathcal{L}_\alpha(x, \Theta) + L_\alpha f(x, \Theta) + H_\alpha r(x, \Theta) + G_\alpha h(x, \Theta) = \lim_{t \to +\infty} \int_0^t e^{-\alpha s - \Lambda^{\mu}(x,s)} \Big[ -\rho + f(\phi(x,s), \mu(s)) + \lambda Qh(\phi(x,s), \mu(s)) \Big] ds$. (50)

Proof: From equation (19) we have

$G_\alpha g(x, \Theta) \leq bK_\lambda + g(x)$, (51)

for all $x \in E$ and $\Theta \in \mathbb{V}^r$. Consequently, by using equation (46) and the fact that $f \geq 0$ and $r \geq 0$, it follows that $\mathcal{R}_\alpha(\rho, h)(x) \geq -\rho \mathcal{L}_\alpha(x, \widehat{\Theta}(x)) + G_\alpha h(x, \widehat{\Theta}(x)) \geq -(\rho + b\|h\|_g) K_\lambda - \|h\|_g g(x)$, showing the first part of the result.

From Assumption 3.7 a), b) and e), we have that $\lim_{t \to +\infty} \int_0^t e^{-\alpha s - \Lambda^{\mu}(x,s)} \big[ -\rho + f(\phi(x,s), \mu(s)) \big] ds$ exists in $\mathbb{R}$, and from equation (51), $\lim_{t \to +\infty} \int_0^t e^{-\alpha s - \Lambda^{\mu}(x,s)} \lambda Qg(\phi(x,s), \mu(s)) ds$ exists in $\mathbb{R}$. By using the fact that $h \in \mathbb{B}_g(E)$, it follows that the limit on the right hand side of equation (50) exists. Finally, from Remark 3.8 i) we get the last part of the result. □

The next result shows that for any function $h \in \mathbb{B}_g(E)$, the one-stage optimization operators $\mathcal{R}_\alpha(\rho, h)(x)$ and $\mathcal{T}_\alpha(\rho, h)(x)$ are equal and that there exists an ordinary feedback measurable selector for the one-stage optimization problems associated to these operators.

Theorem 3.22 Let $\alpha > 0$, $\rho \in \mathbb{R}_+$, $h \in \mathbb{B}_g(E)$ and set $w = \mathcal{R}_\alpha(\rho, h)$. Then $w \in \mathbb{M}^{ac}(E)$ and the ordinary feedback measurable selector $\widehat{u}_\phi(w, h) \in \mathcal{S}_{\mathbb{V}}$ (see item D2) of Definition 3.5) satisfies the following one-stage optimization problems:

$\mathcal{R}_\alpha(\rho, h)(x) = \mathcal{T}_\alpha(\rho, h)(x) = -\rho \mathcal{L}_\alpha(x, \widehat{u}_\phi(w,h)(x)) + L_\alpha f(x, \widehat{u}_\phi(w,h)(x)) + H_\alpha r(x, \widehat{u}_\phi(w,h)(x)) + G_\alpha h(x, \widehat{u}_\phi(w,h)(x))$. (52)

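Proof: According to Lemma 3.20, there exists $\widehat{\Theta} \in \mathcal{S}_{\mathbb{V}^r}$ such that for all $x \in E$ and $t \in [0, t_*(x))$ we have

$e^{-\alpha t - \Lambda^{\widehat{\mu}(x)}(x,t)} w(\phi(x,t)) - w(x) = \int_0^t e^{-\alpha s - \Lambda^{\widehat{\mu}(x)}(x,s)} \Big[ \rho - f(\phi(x,s), \widehat{\mu}(x,s)) - \lambda Qh(\phi(x,s), \widehat{\mu}(x,s)) \Big] ds$, (53)

where $\widehat{\Theta}(x) = (\widehat{\mu}(x), \widehat{\mu}_\partial(x))$. Since $w \in \mathbb{M}^{ac}(E)$, we obtain from equation (53) that

$\mathcal{X} w(\phi(x,t)) - [\alpha + \lambda(\phi(x,t), \widehat{\mu}(x,t))] w(\phi(x,t)) = -f(\phi(x,t), \widehat{\mu}(x,t)) - \lambda Qh(\phi(x,t), \widehat{\mu}(x,t)) + \rho$,

$\eta$-a.s. on $[0, t_*(x))$, implying that

$-\mathcal{X} w(\phi(x,t)) + \alpha w(\phi(x,t)) \geq \inf_{\mu \in \mathcal{P}(\mathbb{U}(\phi(x,t)))} \Big\{ f(\phi(x,t), \mu) - \lambda(\phi(x,t), \mu) w(\phi(x,t)) + \lambda Qh(\phi(x,t), \mu) \Big\} - \rho$.

However, notice that

$\inf_{\mu \in \mathcal{P}(\mathbb{U}(\phi(x,t)))} \Big\{ f(\phi(x,t), \mu) - \lambda(\phi(x,t), \mu) w(\phi(x,t)) + \lambda Qh(\phi(x,t), \mu) \Big\} - \rho = \inf_{a \in \mathbb{U}(\phi(x,t))} \Big\{ f(\phi(x,t), a) - \lambda(\phi(x,t), a) \big[ w(\phi(x,t)) - Qh(\phi(x,t), a) \big] \Big\} - \rho$.

Consequently, by considering the measurable selector $\widehat{u} \in \mathcal{S}_{\mathbb{U}}$ given by $\widehat{u} = \widehat{u}(w, h)$ (see Definition 3.5 D1)), we have that

$-\mathcal{X} w(\phi(x,t)) + \alpha w(\phi(x,t)) \geq -\rho + f(\phi(x,t), \widehat{u}(\phi(x,t))) - \lambda(\phi(x,t), \widehat{u}(\phi(x,t))) \big[ w(\phi(x,t)) - Qh(\phi(x,t), \widehat{u}(\phi(x,t))) \big]$,

$\eta$-a.s. on $[0, t_*(x))$, implying that

$-\mathcal{X} w(\phi(x,t)) + \alpha w(\phi(x,t)) = -\rho + f(\phi(x,t), \widehat{u}(\phi(x,t))) - \lambda(\phi(x,t), \widehat{u}(\phi(x,t))) \big[ w(\phi(x,t)) - Qh(\phi(x,t), \widehat{u}(\phi(x,t))) \big]$,

$\eta$-a.s. on $[0, t_*(x))$, otherwise this would lead to a contradiction with equation (47). Consequently, for all $t \in [0, t_*(x))$ it follows that

$w(x) = e^{-(\alpha t + \widehat{\Lambda}(x,t))} w(\phi(x,t)) + \int_0^t e^{-(\alpha s + \widehat{\Lambda}(x,s))} \Big[ f(\phi(x,s), \widehat{u}(\phi(x,s))) + \lambda(\phi(x,s), \widehat{u}(\phi(x,s))) Qh(\phi(x,s), \widehat{u}(\phi(x,s))) - \rho \Big] ds$, (54)

where we set $\widehat{\Lambda}(x,t) = \int_0^t \lambda(\phi(x,s), \widehat{u}(\phi(x,s))) ds$.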

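First consider the case in which $t_*(x) < \infty$. We obtain, by taking the limit as $t$ tends to $t_*(x)$ in the previous equation, that the ordinary feedback measurable selector $\widehat{u}_\phi(w, h) \in \mathcal{S}_{\mathbb{V}}$ (see item D2) of Definition 3.5) satisfies:

$w(x) = e^{-(\alpha t_*(x) + \widehat{\Lambda}(x, t_*(x)))} w(\phi(x, t_*(x))) - \rho \mathcal{L}_\alpha(x, \widehat{u}_\phi(w,h)(x)) + L_\alpha f(x, \widehat{u}_\phi(w,h)(x)) + \int_0^{t_*(x)} e^{-(\alpha s + \widehat{\Lambda}(x,s))} \lambda(\phi(x,s), \widehat{u}(\phi(x,s))) Qh(\phi(x,s), \widehat{u}(\phi(x,s))) ds$. (55)

Define the control $\Theta(x)$ by $(\widehat{\mu}(x), \mu)$ for $\mu \in \mathcal{P}(\mathbb{U}(\phi(x, t_*(x))))$. Therefore, we have that

$w(x) \leq -\rho \mathcal{L}_\alpha(x, \Theta(x)) + L_\alpha f(x, \Theta(x)) + \int_0^{t_*(x)} e^{-\alpha s - \Lambda^{\widehat{\mu}(x)}(x,s)} \lambda Qh(\phi(x,s), \widehat{\mu}(x,s)) ds + e^{-\alpha t_*(x) - \Lambda^{\widehat{\mu}(x)}(x, t_*(x))} \big[ Qh(\phi(x, t_*(x)), \mu) + r(\phi(x, t_*(x)), \mu) \big]$. (56)

From equation (48), we have that

$w(x) = \int_0^t e^{-\alpha s - \Lambda^{\widehat{\mu}(x)}(x,s)} \Big[ -\rho + f(\phi(x,s), \widehat{\mu}(x,s)) + \lambda Qh(\phi(x,s), \widehat{\mu}(x,s)) \Big] ds + e^{-\alpha t - \Lambda^{\widehat{\mu}(x)}(x,t)} w(\phi(x,t))$.

Since $w \in \mathbb{M}^{ac}(E)$, this yields that

$w(x) = -\rho \mathcal{L}_\alpha(x, \widehat{\Theta}(x)) + L_\alpha f(x, \widehat{\Theta}(x)) + \int_0^{t_*(x)} e^{-\alpha s - \Lambda^{\widehat{\mu}(x)}(x,s)} \lambda Qh(\phi(x,s), \widehat{\mu}(x,s)) ds + e^{-\alpha t_*(x) - \Lambda^{\widehat{\mu}(x)}(x, t_*(x))} w(\phi(x, t_*(x)))$. (57)

From Assumption 3.2 we have that $e^{-\Lambda^{\widehat{\mu}(x)}(x, t_*(x))} > 0$. Therefore, combining equations (56) and (57), it gives that for all $x \in E$ and $\mu \in \mathcal{P}(\mathbb{U}(\phi(x, t_*(x))))$,

$w(\phi(x, t_*(x))) \leq Qh(\phi(x, t_*(x)), \mu) + r(\phi(x, t_*(x)), \mu)$.

Clearly, by using equation (46), it can be claimed that the previous inequality becomes an equality for $\mu = \widehat{\mu}_\partial(x)$, implying that

$w(\phi(x, t_*(x))) = \inf_{\mu \in \mathcal{P}(\mathbb{U}(\phi(x, t_*(x))))} \big\{ r(\phi(x, t_*(x)), \mu) + Qh(\phi(x, t_*(x)), \mu) \big\} = \inf_{a \in \mathbb{U}(\phi(x, t_*(x)))} \big\{ r(\phi(x, t_*(x)), a) + Qh(\phi(x, t_*(x)), a) \big\}$.

Consequently, we have that

$w(\phi(x, t_*(x))) = r(\phi(x, t_*(x)), \widehat{u}(\phi(x, t_*(x)))) + Qh(\phi(x, t_*(x)), \widehat{u}(\phi(x, t_*(x))))$. (58)

Combining equations (55) and (58), it follows that

$w(x) = -\rho \mathcal{L}_\alpha(x, \widehat{u}_\phi(w,h)(x)) + L_\alpha f(x, \widehat{u}_\phi(w,h)(x)) + H_\alpha r(x, \widehat{u}_\phi(w,h)(x)) + G_\alpha h(x, \widehat{u}_\phi(w,h)(x))$.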

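Consider now the case in which $t_*(x) = \infty$. By using equations (54) and (49) we obtain that

$w(x) \geq -e^{-(\alpha t + \widehat{\Lambda}(x,t))} \big[ (\rho + b\|h\|_g) K_\lambda + \|h\|_g g(\phi(x,t)) \big] + \int_0^t e^{-(\alpha s + \widehat{\Lambda}(x,s))} \Big[ f(\phi(x,s), \widehat{u}(\phi(x,s))) + \lambda(\phi(x,s), \widehat{u}(\phi(x,s))) Qh(\phi(x,s), \widehat{u}(\phi(x,s))) - \rho \Big] ds$. (59)

However, from Assumption 3.7 a) and d) we obtain that

$\lim_{t \to +\infty} e^{-(\alpha t + \widehat{\Lambda}(x,t))} \big[ (\rho + b\|h\|_g) K_\lambda + \|h\|_g g(\phi(x,t)) \big] = 0$. (60)

Consequently, combining equations (50), (59) and (60), the ordinary feedback measurable selector $\widehat{u}_\phi(w, h) \in \mathcal{S}_{\mathbb{V}}$ satisfies:

$w(x) \geq -\rho \mathcal{L}_\alpha(x, \widehat{u}_\phi(w,h)(x)) + L_\alpha f(x, \widehat{u}_\phi(w,h)(x)) + H_\alpha r(x, \widehat{u}_\phi(w,h)(x)) + G_\alpha h(x, \widehat{u}_\phi(w,h)(x))$.

By using equation (46) it follows that the inequality in the previous equation is in fact an equality.

In conclusion, since $\mathbb{V}(x) \subset \mathbb{V}^r(x)$ it follows that $\mathcal{R}_\alpha(\rho, h)(x) \leq \mathcal{T}_\alpha(\rho, h)(x)$. However, we have shown that $\widehat{u}_\phi(w, h) \in \mathcal{S}_{\mathbb{V}}$ satisfies

$\mathcal{R}_\alpha(\rho, h)(x) = -\rho \mathcal{L}_\alpha(x, \widehat{u}_\phi(w,h)(x)) + L_\alpha f(x, \widehat{u}_\phi(w,h)(x)) + H_\alpha r(x, \widehat{u}_\phi(w,h)(x)) + G_\alpha h(x, \widehat{u}_\phi(w,h)(x))$,

which is the desired result. □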

4 Main results 

It has been shown in a previous work of the authors (see Theorem 6.2 in [4]) that if there exists $(\rho, h) \in \mathbb{R}_+ \times \mathbb{M}(E)$ with $h$ bounded from below satisfying the discrete-time optimality equation $\mathcal{T}(\rho, h)(x) = h(x)$, and the technical condition $\lim_{t \to +\infty} \frac{1}{t} \lim_{m \to +\infty} E^{U}_{(x,0)} \big[ h(X(t \wedge T_m)) \big] = 0$ for all $U \in \mathcal{U}$, then there exists an ordinary feedback optimal control strategy $\widehat{U}$ for the long run average-cost problem and moreover $\rho = \mathcal{J}_{\mathcal{A}}(x) = \mathcal{A}(\widehat{U}, x)$. However, it is hard to obtain a solution for the discrete-time optimality equation $\mathcal{T}(\rho, h)(x) = h(x)$. A classical method to deal with this difficulty is to follow the so-called vanishing discount approach in order to show that there exists $(\rho, h) \in \mathbb{R} \times \mathbb{M}(E)$ with $h$ bounded from below satisfying an optimality inequality of the kind $h \geq \mathcal{T}(\rho, h)$. By using the fact that $h$ is bounded from below, the previous inequality leads to the existence of an optimal control. In this context, a classical hypothesis (see for example Assumption 5.4.1 in [16, page 86]) is to assume that the difference of the $\alpha$-discount value functions $\mathcal{J}^\alpha_{\mathcal{D}}(\cdot) - \mathcal{J}^\alpha_{\mathcal{D}}(x_0)$ is bounded from below. This approach has been developed in [4, Theorem 8.5] to ensure the existence of an optimal ordinary feedback control strategy.

Under the hypotheses made in sub-section 3.1, Theorem 3.17 of sub-section 3.2 only bounds the difference of the $\alpha$-discount value functions $\mathcal{J}^\alpha_{\mathcal{D}}(\cdot) - \mathcal{J}^\alpha_{\mathcal{D}}(x_0)$ in terms of $g$, so that this difference is not necessarily bounded from below. This result implies the existence of a pair $(\rho, h)$ satisfying $h \geq \mathcal{T}(\rho, h)$ where $\rho \in \mathbb{R}_+$ but with $h \in \mathbb{B}_g(E)$. Consequently, the result presented in [4] cannot be directly used. The idea to overcome this difficulty is to show in Proposition 4.4 that for $\widehat{u} \in \mathcal{S}_{\mathbb{U}}$, $\lim_{t \to +\infty} \frac{1}{t} \lim_{m \to \infty} E^{U_{\widehat{u}_\phi}}_{(x,0)} \big[ \mathcal{T}(\rho, h)(X(t \wedge T_m)) \big] \geq 0$ in order to obtain in Theorem 4.5 the main result of this paper, which is the existence of an optimal ordinary feedback control strategy for the long run average-cost problem of a PDMP.



First we need the following auxiliary result: 

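Lemma 4.1 Consider an arbitrary $u \in \mathcal{S}_{\mathbb{U}}$ and let $u_\phi$ and $U_{u_\phi}$ be as in Definitions 2.8 and 2.9 respectively. For all $x \in E$ define $\bar{g}(x) = -b\mathcal{L}_{-c}(x, u_\phi(x)) + G_{-c} g(x, u_\phi(x))$. Then $\bar{g} \in \mathbb{B}_g(E)$ and $U_{u_\phi}$ satisfies

$E^{U_{u_\phi}}_{(x,0)} \big[ \bar{g}(X(t \wedge T_m)) \big] \leq e^{-ct} g(x) + \frac{b}{c} \big[ 1 - e^{-ct} \big] + a\|\bar{g}\|_g g(x) \kappa^m + \|\bar{g}\|_g \nu_u(g) + bK_\lambda$. (61)

Proof: From (18) with $\alpha = -c$ and recalling that $r(z) \geq 0$ we obtain that $-b\mathcal{L}_{-c}(x, u_\phi(x)) + G_{-c} g(x, u_\phi(x)) \leq g(x)$. Clearly, $\bar{g} \in \mathbb{M}(E)$ is bounded from below by $-bK_\lambda$ from Assumption 3.7 b) and thus $\bar{g} \in \mathbb{B}_g(E)$. Since $\bar{g} \in \mathbb{M}(E)$ is bounded from below, it is easy to show that

$-b E^{U_{u_\phi}}_{(x,0)} \Big[ \int_0^{t \wedge T_m} e^{cs} ds \Big] + E^{U_{u_\phi}}_{(x,0)} \big[ e^{c(t \wedge T_m)} \bar{g}(X(t \wedge T_m)) \big] \leq g(x)$,

by using the same arguments as in the proof of Proposition 4.4 in [4]. Combining Fatou's Lemma and Assumption 2.3 we obtain that

$E^{U_{u_\phi}}_{(x,0)} \big[ \bar{g}(X(t)) \big] \leq e^{-ct} g(x) + \frac{b}{c} \big[ 1 - e^{-ct} \big]$. (62)

Clearly, we have $E^{U_{u_\phi}}_{(x,0)} \big[ \bar{g}(X(t \wedge T_m)) \big] = E^{U_{u_\phi}}_{(x,0)} \big[ 1_{\{t < T_m\}} \bar{g}(X(t)) \big] + E^{U_{u_\phi}}_{(x,0)} \big[ 1_{\{t \geq T_m\}} \bar{g}(X(T_m)) \big]$. Consequently, we get

$E^{U_{u_\phi}}_{(x,0)} \big[ \bar{g}(X(t \wedge T_m)) \big] \leq E^{U_{u_\phi}}_{(x,0)} \big[ \bar{g}(X(t)) \big] + G^m \bar{g}(x, u_\phi(x)) + bK_\lambda$

by recalling that $\bar{g}$ is bounded from below by $-bK_\lambda$. The result follows by using Assumption 3.6 and equation (62). □

We have the following propositions showing that there exists $(\rho, h) \in \mathbb{R}_+ \times \mathbb{B}_g(E)$ such that the optimality inequality $h \geq \mathcal{T}(\rho, h)$ is satisfied: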


Proposition 4.2 Set $\rho_\alpha = \alpha \mathcal{J}^\alpha_{\mathcal{D}}(x_0)$ for a fixed state $x_0 \in E$. Then there exists a decreasing sequence of positive numbers $\alpha_k \downarrow 0$ such that $\rho_{\alpha_k} \to \rho$ and for all $x \in E$, $\lim_{k \to \infty} \alpha_k \mathcal{J}^{\alpha_k}_{\mathcal{D}}(x) = \rho$.

Proof: From equation (28), we obtain that there exist $\beta > 0$, $C > 0$, such that for $\alpha \in (0, \beta]$, $\rho_\alpha \leq C$. By using the lemma on page 88 in [16], the result follows. □
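
The two ingredients behind this proposition can be made explicit. The following is a sketch rather than the authors' argument: $K$ is a symbol introduced here for the constant multiplying $(1+g(y))g(x)$ in (43), and nonnegativity of $\mathcal{J}^\alpha_{\mathcal{D}}$ is taken from the proof of Lemma 3.14.

```latex
\begin{align*}
0 \leq \rho_\alpha = \alpha\,\mathcal{J}^\alpha_{\mathcal{D}}(x_0)
  &\leq \frac{\alpha M}{c+\alpha}\,g(x_0) + \frac{Mb}{c}
   \leq M g(x_0) + \frac{Mb}{c}
  && \text{by (28), since } \tfrac{\alpha}{c+\alpha} \leq 1, \\
\bigl|\alpha_k \mathcal{J}^{\alpha_k}_{\mathcal{D}}(x) - \rho_{\alpha_k}\bigr|
  &= \alpha_k \bigl|\mathcal{J}^{\alpha_k}_{\mathcal{D}}(x) - \mathcal{J}^{\alpha_k}_{\mathcal{D}}(x_0)\bigr|
   \leq \alpha_k\,K\,\bigl(1+g(x_0)\bigr)\,g(x)
   \xrightarrow[k\to\infty]{} 0
  && \text{by (43)}.
\end{align*}
```

The first line gives a bounded family, hence a convergent subsequence $\rho_{\alpha_k} \to \rho$; the second line shows that the limit of $\alpha_k \mathcal{J}^{\alpha_k}_{\mathcal{D}}(x)$ is the same for every $x$ because $K$ does not depend on $\alpha$.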



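Proposition 4.3 Set $h_\alpha(\cdot) = \mathcal{J}^\alpha_{\mathcal{D}}(\cdot) - \mathcal{J}^\alpha_{\mathcal{D}}(x_0)$ for $x_0 \in E$ as in Proposition 4.2 and write $h = \liminf_{k \to \infty} h_{\alpha_k}$. Then $h \in \mathbb{B}_g(E)$ and, for all $x \in E$, $h(x) \geq \mathcal{T}(\rho, h)(x)$.

Proof: From Proposition 7.1 and Theorem 7.5 in [4] we have that the following equation is satisfied for each $\alpha > 0$ and $x \in E$:

$h_\alpha(x) = \mathcal{T}_\alpha(\rho_\alpha, h_\alpha)(x) = -\rho_\alpha \mathcal{L}_\alpha(x, u^\alpha_\phi(x)) + L_\alpha f(x, u^\alpha_\phi(x)) + H_\alpha r(x, u^\alpha_\phi(x)) + G_\alpha h_\alpha(x, u^\alpha_\phi(x))$, (63)

for $u^\alpha_\phi \in \mathcal{S}_{\mathbb{V}}$. For $x \in E$ fixed and for all $k \in \mathbb{N}$, $u^{\alpha_k}_\phi(x) \in \mathbb{V}(x) \subset \mathbb{V}^r(x)$ and since $\mathbb{V}^r(x)$ is compact we can find a further subsequence, still written as $u^{\alpha_k}_\phi(x)$ for notational simplicity, such that $u^{\alpha_k}_\phi(x) \to \widehat{\Theta} \in \mathbb{V}^r(x)$. Combining equations (43), (63) and Proposition 3.18,

$h(x) = \liminf_{k \to \infty} \Big\{ -\rho_{\alpha_k} \mathcal{L}_{\alpha_k}(x, u^{\alpha_k}_\phi(x)) + L_{\alpha_k} f(x, u^{\alpha_k}_\phi(x)) + H_{\alpha_k} r(x, u^{\alpha_k}_\phi(x)) + G_{\alpha_k} h_{\alpha_k}(x, u^{\alpha_k}_\phi(x)) \Big\} \geq -\rho \mathcal{L}(x, \widehat{\Theta}) + Lf(x, \widehat{\Theta}) + Hr(x, \widehat{\Theta}) + Gh(x, \widehat{\Theta})$. (64)

Therefore, from Theorem 3.22 it follows that

$h(x) \geq \mathcal{R}(\rho, h)(x) = \mathcal{T}(\rho, h)(x)$,

showing the result. □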
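
The second equality in (63) can be checked directly from (30) and the identity $G_\alpha 1 + \alpha\mathcal{L}_\alpha = 1$ used in the proof of Theorem 3.12. The computation below is a sketch; every operator is evaluated at $(x, u^\alpha_\phi(x))$, and only the linearity of $G_\alpha$ is assumed beyond what the text states.

```latex
\begin{align*}
G_\alpha h_\alpha
  &= G_\alpha \mathcal{J}^\alpha_{\mathcal{D}} - \mathcal{J}^\alpha_{\mathcal{D}}(x_0)\,G_\alpha 1
   = G_\alpha \mathcal{J}^\alpha_{\mathcal{D}} - \mathcal{J}^\alpha_{\mathcal{D}}(x_0)\bigl(1 - \alpha\mathcal{L}_\alpha\bigr), \\
-\rho_\alpha \mathcal{L}_\alpha + L_\alpha f + H_\alpha r + G_\alpha h_\alpha
  &= \bigl(L_\alpha f + H_\alpha r + G_\alpha \mathcal{J}^\alpha_{\mathcal{D}}\bigr)
     - \mathcal{J}^\alpha_{\mathcal{D}}(x_0)
     + \bigl(\alpha \mathcal{J}^\alpha_{\mathcal{D}}(x_0) - \rho_\alpha\bigr)\mathcal{L}_\alpha \\
  &= \mathcal{J}^\alpha_{\mathcal{D}}(x) - \mathcal{J}^\alpha_{\mathcal{D}}(x_0)
   = h_\alpha(x),
\end{align*}
```

where the last line uses (30) for the bracketed term and $\rho_\alpha = \alpha \mathcal{J}^\alpha_{\mathcal{D}}(x_0)$ to cancel the $\mathcal{L}_\alpha$ term. This is why the choice $\rho_\alpha = \alpha\mathcal{J}^\alpha_{\mathcal{D}}(x_0)$ turns the discounted optimality equation into an average-cost-type equation for $h_\alpha$.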

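From now on, $\rho$ and $h$ are fixed as in Propositions 4.2 and 4.3 and we set $\widehat{u} = \widehat{u}(\mathcal{T}(\rho, h), h)$. Clearly, it satisfies the following one-stage optimization problems:

$\mathcal{R}(\rho, h)(x) = \mathcal{T}(\rho, h)(x) = -\rho \mathcal{L}(x, \widehat{u}_\phi(x)) + Lf(x, \widehat{u}_\phi(x)) + Hr(x, \widehat{u}_\phi(x)) + Gh(x, \widehat{u}_\phi(x))$. (65)

We need to show that $\lim_{t \to +\infty} \frac{1}{t} \lim_{m \to \infty} E^{U_{\widehat{u}_\phi}}_{(x,0)} \big[ \mathcal{T}(\rho, h)(X(t \wedge T_m)) \big] \geq 0$. The next proposition provides this result.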



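Proposition 4.4 For all $x \in E$, $E^{U_{\widehat{u}_\phi}}_{(x,0)} \big[ \mathcal{T}(\rho, h)(X(t \wedge T_m)) \big]$ is well defined and satisfies

$\lim_{t \to +\infty} \frac{1}{t} \lim_{m \to \infty} E^{U_{\widehat{u}_\phi}}_{(x,0)} \big[ \mathcal{T}(\rho, h)(X(t \wedge T_m)) \big] \geq 0$. (66)

Proof: By definition, we have that $\mathcal{T}(\rho, h)(x) \geq -\rho \mathcal{L}(x, \widehat{u}_\phi(x)) + Gh(x, \widehat{u}_\phi(x))$. Therefore, using the definition of $\bar{g}$ in Lemma 4.1 with $u = \widehat{u}$ we obtain that

$\mathcal{T}(\rho, h)(x) \geq -(\rho + b\|h\|_g) K_\lambda - \|h\|_g \bar{g}(x)$. (67)

Consequently, combining equations (61) and (67) we get that the negative part of $\mathcal{T}(\rho, h)(X(t \wedge T_m))$ is integrable, implying that $E^{U_{\widehat{u}_\phi}}_{(x,0)} \big[ \mathcal{T}(\rho, h)(X(t \wedge T_m)) \big]$ is well defined, and that (66) holds, showing the result. □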
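
To see how (61) and (67) combine, the estimate can be written out as follows. This is a sketch under a stated reading of the two equations: (67) is taken as a lower bound on $\mathcal{T}(\rho,h)$ in terms of the auxiliary function of Lemma 4.1 (written $\bar g$ here), (61) as an upper bound on its expectation, $E$ abbreviates the expectation under the feedback strategy, and $\nu$ is the invariant measure of Assumption 3.6.

```latex
\begin{align*}
E\bigl[\mathcal{T}(\rho,h)(X(t\wedge T_m))\bigr]
  &\geq -(\rho + b\|h\|_g)K_\lambda - \|h\|_g\,E\bigl[\bar g(X(t\wedge T_m))\bigr] \\
  &\geq -(\rho + b\|h\|_g)K_\lambda
     - \|h\|_g\Bigl(e^{-ct}g(x) + \tfrac{b}{c}\bigl[1-e^{-ct}\bigr]
     + a\|\bar g\|_g\,g(x)\,\kappa^m + \|\bar g\|_g\,\nu(g) + bK_\lambda\Bigr).
\end{align*}
```

Since $0 < \kappa < 1$ and $c > 0$, the right-hand side is bounded below by a constant depending on $x$ only, uniformly in $m$ and $t$; dividing by $t$ and letting $t \to +\infty$ then gives a nonnegative limit, which is the content of (66).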



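The next theorem, which is the main result of this paper, shows that the ordinary feedback control $U_{\widehat{u}_\phi}$ is an optimal strategy for the long run average-cost problem of a PDMP.

Theorem 4.5 For all $x \in E$,

$\rho = \mathcal{J}_{\mathcal{A}}(x) = \mathcal{A}(U_{\widehat{u}_\phi}, x)$.

Proof: Define

$J^{U_{\widehat{u}_\phi}}_m(t, x) = E^{U_{\widehat{u}_\phi}}_{(x,0)} \Big[ \int_0^{t \wedge T_m} \big[ f(X(s), \widehat{u}(X(s))) - \rho \big] ds + \int_0^{t \wedge T_m} r(X(s-), \widehat{u}_\partial(X(s-))) dp^*(s) + \mathcal{T}(\rho, h)(X(t \wedge T_m)) \Big]$.

From Proposition 4.4 we have that $E^{U_{\widehat{u}_\phi}}_{(x,0)} \big[ \mathcal{T}(\rho, h)(X(t \wedge T_m)) \big]$ is well defined. Consequently, following the same arguments as in the proof of Proposition 4.4 in [4], we can show that $J^{U_{\widehat{u}_\phi}}_m(t, x) \leq h(x)$ for all $m \in \mathbb{N}$, $(t, x) \in \mathbb{R}_+ \times E$. Therefore,

$E^{U_{\widehat{u}_\phi}}_{(x,0)} \Big[ \int_0^{t \wedge T_m} f(X(s), \widehat{u}(X(s))) ds + \int_0^{t \wedge T_m} r(X(s-), \widehat{u}_\partial(X(s-))) dp^*(s) \Big] + E^{U_{\widehat{u}_\phi}}_{(x,0)} \big[ \mathcal{T}(\rho, h)(X(t \wedge T_m)) \big] \leq \rho t + h(x)$.

Combining Assumption 2.3, the monotone convergence theorem and equation (66), it follows that

$\lim_{t \to +\infty} \frac{1}{t} E^{U_{\widehat{u}_\phi}}_{(x,0)} \Big[ \int_0^t f(X(s), \widehat{u}(X(s))) ds + \int_0^t r(X(s-), \widehat{u}_\partial(X(s-))) dp^*(s) \Big] \leq \rho$,

showing that $\mathcal{J}_{\mathcal{A}}(x) \leq \mathcal{A}(U_{\widehat{u}_\phi}, x) \leq \rho$. However, according to Theorem 1 in [19, chapter 5] we have that $\lim_{\alpha \downarrow 0} \alpha \mathcal{J}^\alpha_{\mathcal{D}}(x) \leq \mathcal{J}_{\mathcal{A}}(x)$. Consequently, from Proposition 4.2 it follows that $\rho \leq \mathcal{J}_{\mathcal{A}}(x)$, completing the proof. □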



References 

[1] A. Almudevar. A dynamic programming algorithm for the optimal control of piecewise deterministic Markov processes. SIAM J. of Control and Optim., 40(2):525-539, 2001.

[2] D.P. Bertsekas and S.E. Shreve. Stochastic optimal control, volume 139 of Mathematics in Science and Engineering. Academic Press Inc., New York, 1978. The discrete time case.

[3] O.L.V. Costa. Average impulse control of piecewise deterministic processes. IMA J. Math. Control Inform., 6(4):375-397, 1989.

[4] O.L.V. Costa and F. Dufour. Average control of piecewise deterministic Markov processes. ArXiv, 0809.0477v1, page 34, 2008. Available at http://arxiv.org/abs/0809.0477.

[5] O.L.V. Costa and F. Dufour. Relaxed long run average continuous control of piecewise deterministic Markov processes. In Proceedings of the European Control Conference, pages 5052-5059, Kos, Greece, July, 2007.

[6] M.H.A. Davis. Piecewise-deterministic Markov processes: A general class of non-diffusion stochastic models. J. Royal Statistical Soc. (B), 46:353-388, 1984.

[7] M.H.A. Davis. Control of piecewise-deterministic processes via discrete-time dynamic programming. In Stochastic differential systems (Bad Honnef, 1985), volume 78 of Lecture Notes in Control and Inform. Sci., pages 140-150. Springer, Berlin, 1986.

[8] M.H.A. Davis. Markov Models and Optimization. Chapman and Hall, London, 1993.

[9] M.A.H. Dempster and J.J. Ye. Necessary and sufficient optimality conditions for control of piecewise deterministic processes. Stochastics and Stochastics Reports, 40:125-145, 1992.

[10] M.A.H. Dempster and J.J. Ye. Generalized Bellman-Hamilton-Jacobi optimality conditions for a control problem with boundary conditions. Appl. Math. Optimization, 33:211-225, 1996.

[11] E.B. Dynkin and A.A. Yushkevich. Controlled Markov processes, volume 235 of Grundlehren der Mathematischen Wissenschaften. Springer-Verlag, Berlin, 1979.

[12] L. Forwick, M. Schäl, and M. Schmitz. Piecewise deterministic Markov control processes with feedback controls and unbounded costs. Acta Appl. Math., 82(3):239-267, 2004.

[13] D. Gatarek. Impulsive control of piecewise-deterministic processes with long run average cost. Stochastics Stochastics Rep., 45(3-4):127-143, 1993.

[14] X. Guo and U. Rieder. Average optimality for continuous-time Markov decision processes in Polish spaces. The Annals of Applied Probability, 16:730-756, 2006.

[15] X. Guo and Q. Zhu. Average optimality for Markov decision processes in Borel spaces: A new condition and approach. Journal of Applied Probability, 43:318-334, 2006.

[16] O. Hernández-Lerma and J.B. Lasserre. Discrete-time Markov control processes, volume 30 of Applications of Mathematics. Springer-Verlag, New York, 1996. Basic optimality criteria.

[17] O. Hernández-Lerma and J.B. Lasserre. Further topics on discrete-time Markov control processes, volume 42 of Applications of Mathematics. Springer-Verlag, New York, 1999.

[18] M. Schäl. On piecewise deterministic Markov control processes: control of jumps and of risk processes in insurance. Insurance Math. Econom., 22(1):75-91, 1998.

[19] D.V. Widder. The Laplace Transform. Princeton Mathematical Series, v. 6. Princeton University Press, Princeton, N. J., 1941.

[20] A.A. Yushkevich. Bellman inequalities in Markov decision deterministic drift processes. Stochastics, 23:235-274, 1987.

[21] A.A. Yushkevich. Verification theorems for Markov decision processes with controlled deterministic drift and gradual and impulsive controls. Theory Probab. Appl., 34(3):474-496, 1989.

[22] Q. Zhu. Average optimality for continuous-time Markov decision processes with a policy iteration approach. Journal of Mathematical Analysis and Applications, 339:691-704, 2008.



